I have been using SafeSquid-SWG for at least a year now, but I still face problems analyzing SafeSquid-SWG Logs.
SafeSquid produces many different logs in different formats. The one I am most interested in is the Extended Log (similar to the access log of the Squid Proxy Server). It includes a lot more important information than a typical Squid access log file.
-------------------------------------------------------
SafeSquid-SWG Extended Log Format:
-------------------------------------------------------
"record_id" "client_id" "request_id" "date_time" "elapsed_time" "status" "size" "upload" "download" "bypassed" "client_ip" "username" "method" "url" "http_referer" "useragent""mime" "filter_name" "filtering_reason" "interface" "cachecode" "peercode" "peer" "request_host" "request_tld" "referer_host" "referer_tld" "range" "time_profiles" "user_groups" "request_profiles" "application_signatures" "categories" "response_profiles" "upload_content_types" "download_content_types" "profiles"
--------------------------------------------------------------
SafeSquid-SWG Extended Log Sample Data:
---------------------------------------------------------------
"158166470831RXQGu" "3" "1" "14/Feb/2020:12:48:29" "276" "200" "0" "0" "0" "FALSE" "192.168.0.17" "anonymous@192.168.0.17" "CONNECT" "connect://clients1.google.com:443/" "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.100 Safari/537.36" "-" "-" "-" "192.168.248.190:8080" "UNSPECIFIED" "NONE" "-" "clients1.google.com" "google.com" "-""-" "0" "BUSINESS HOURS,YOUTUBE STREAMING HOURS" "SafeSquid Admin Group With No Auth" "Chrome,Internet Browser,Google,CLIENTS4.GOOGLE.COM,GMAIL WEB APP ALL ADDED" "Chrome,Internet Browser,Google" "Search Engines & Portals" "" "-" "-" "ENFORCE LOW PRIVACY LEVEL,PERMIT PERSONAL GOOGLE ACCOUNTS"
Why am I not using Excel to analyze the logs?
Depending on usage, a log file can grow beyond roughly 2 GB, at which point analysis in MS Excel becomes almost impossible.
Why don't I try open source log analyzer tools like calamaris, sarg, etc.?
A quick answer:
- Most open source log analyzers (specifically proxy log analyzers) accept only the Squid access log format.
- Development, maintenance, and support are no longer done by the groups behind them.
- Customization is also very tedious.
That said, I am currently using both calamaris and sarg to generate my custom reports.
But both have their drawbacks:
----------------
Calamaris
----------------
Calamaris is a summary reporting tool that supports the Squid format. Therefore, to generate Calamaris reports, I first need to convert the Extended Log into a Squid-style access log and then run Calamaris. This becomes tedious and time-consuming as the log file grows, and the chances of error are high.
Calamaris reports do not provide the information that I am looking for. Because I have to convert the Extended Log into an access log, the important fields provided by SafeSquid-SWG are removed, and the generated reports become meaningless.
Calamaris also has some problems of its own, and since it is no longer maintained, I cannot report them.
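A rough sketch of what such a conversion could look like, and of how much it throws away. It assumes the record is already a dict keyed by the Extended Log field names, and it targets Squid's native access log layout (timestamp, elapsed, client, result code/status, bytes, method, URL, user, hierarchy/peer, content type). Several points are assumptions, not documented SafeSquid behaviour: that date_time is in UTC, that elapsed_time is in milliseconds, and that cachecode and peercode can be passed through in place of Squid's own codes. The names to_squid_line and dropped_fields are illustrative.

```python
from datetime import datetime, timezone

# Extended Log fields that have a slot in a Squid native access log line.
SQUID_FIELDS = {
    "date_time", "elapsed_time", "client_ip", "cachecode", "status", "size",
    "method", "url", "username", "peercode", "peer", "mime",
}


def to_squid_line(rec):
    """Format one parsed Extended Log record (a dict) as a Squid-native-style line."""
    # Assumption: date_time is UTC; the log line itself carries no time zone.
    # Note: %b matches English month names only under a C/English locale.
    when = datetime.strptime(rec["date_time"], "%d/%b/%Y:%H:%M:%S")
    epoch = when.replace(tzinfo=timezone.utc).timestamp()
    return "{:.3f} {:>6} {} {}/{} {} {} {} {} {}/{} {}".format(
        epoch,
        rec["elapsed_time"],  # assumed to be milliseconds, as in Squid
        rec["client_ip"],
        rec["cachecode"],     # passed through as-is; not a real Squid result code
        rec["status"],
        rec["size"],
        rec["method"],
        rec["url"],
        rec["username"],
        rec["peercode"],
        rec["peer"],
        rec["mime"],
    )


def dropped_fields(rec):
    """List the Extended Log fields that the Squid layout has no place for."""
    return sorted(set(rec) - SQUID_FIELDS)
```

Only 12 of the 37 Extended Log fields fit into this layout. Fields such as filtering_reason, categories, application_signatures, and profiles have nowhere to go, which is why the converted reports lose most of their value.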
--------
Sarg
--------
Sarg is also a problem because of the way it creates its reports: I would run out of inodes within a month.
Sarg also uses the access log, so I again have to convert the logs, and the rest is the same as with Calamaris.
---------------------------
Other Log Analyzer
---------------------------
I have also tried other log analyzers, and I face the same problems as listed above.
A small conclusion
No log analyzer that I have used so far has provided custom log parsing capabilities.
The solution that I am looking for:
- It should deliver insight into my day-to-day internet activity.
- It should briefly explain why a site was blocked.
- It should provide a dashboard with a good view of my internet traffic and the applications that are triggering this traffic.
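As a starting point for that kind of view, here is a minimal sketch that reads the Extended Log one line at a time (so a 2 GB file never has to fit in memory) and counts requests per application signature and filtering reasons per host. It assumes the 37-field quoted layout shown earlier, and it assumes that a filtering_reason of "-" or an empty string means the request was not filtered; that second point is a guess from the sample, not documented behaviour. The function name summarize and the file name extended.log are hypothetical.

```python
import re
from collections import Counter

QUOTED = re.compile(r'"([^"]*)"')

# Zero-based positions taken from the Extended Log format line.
FILTERING_REASON = 18
REQUEST_HOST = 23
APPLICATION_SIGNATURES = 31
FIELD_COUNT = 37


def summarize(lines):
    """Count requests per application signature and filtering reasons per host."""
    apps = Counter()
    blocks = Counter()
    for line in lines:
        values = QUOTED.findall(line)
        if len(values) != FIELD_COUNT:
            continue  # skip malformed or truncated lines
        # application_signatures is a comma-separated list in the sample.
        for app in values[APPLICATION_SIGNATURES].split(","):
            if app and app != "-":
                apps[app] += 1
        reason = values[FILTERING_REASON]
        # Assumption: "-" or an empty value means the request was not filtered.
        if reason not in ("", "-"):
            blocks[(values[REQUEST_HOST], reason)] += 1
    return apps, blocks


# Usage (hypothetical file name); a file object yields one line at a time:
# with open("extended.log", encoding="utf-8", errors="replace") as fh:
#     apps, blocks = summarize(fh)
# print(apps.most_common(10))
# print(blocks.most_common(10))
```

The two counters are plain Python objects, so they can be fed to whatever charting or dashboard tool is at hand; the same loop can be extended to sum the size, upload, or download fields per user or per category.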