How to track Icecast2 visits with Matomo?

My beloved web radio has an Icecast2 instance and it just works. We also have a Matomo instance to track visits on our WordPress website, using only Free/Libre and open source software.
The main issue is that, since Matomo tracks visits via JavaScript, direct visits to the web-radio stream are not captured by Matomo by default.
How can I use Matomo to track visits to Icecast2 audio streams?

Yes, it's possible. Here is how I did it.
First of all, try Matomo's built-in log import script. Be sure to set your --idsite= and the correct path to your Matomo installation:
su www-data -s /bin/bash
python2.7 /var/www/matomo/misc/log-analytics/import_logs.py --show-progress --url=https://matomo.example.com --idsite=1 --recorders=2 --enable-http-errors --log-format-name=icecast2 --strip-query-string /var/log/icecast2/access.log
NOTE: If you see this error:
[INFO] Error when connecting to Matomo: HTTP Error 400: Bad Request
make sure all the required plugins are activated under:
Administration > System > Plugins > Bulk plugin
So, if the script works, it should start printing something like this:
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
Parsing log /var/log/icecast2/access.log...
1013 lines parsed, 200 lines recorded, 99 records/sec (avg), 200 records/sec (current)
If so, stop the script immediately (CTRL+C) to avoid importing duplicate entries before the definitive solution is in place.
Now we need to run this script every time the log is rotated, just before rotation.
The official documentation suggests a crontab, but I don't recommend that solution. I suggest configuring logrotate instead.
Edit the file /etc/logrotate.d/icecast2, changing it from:
/var/log/icecast2/*.log {
...
weekly
...
}
To:
/var/log/icecast2/*.log {
...
daily
prerotate
su www-data -s /bin/bash --command 'python2.7 ... /var/log/icecast2/access.log' > /var/log/logrotate-icecast2-matomo.log
endscript
...
}
IMPORTANT: In the above example replace ... with the right command.
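For clarity, here is what the block could look like with the example import command from earlier dropped in (a sketch only: adapt the Matomo URL, --idsite and paths to your installation, and keep your existing logrotate options where the ... placeholders are):
/var/log/icecast2/*.log {
...
daily
prerotate
su www-data -s /bin/bash --command 'python2.7 /var/www/matomo/misc/log-analytics/import_logs.py --show-progress --url=https://matomo.example.com --idsite=1 --recorders=2 --enable-http-errors --log-format-name=icecast2 --strip-query-string /var/log/icecast2/access.log' > /var/log/logrotate-icecast2-matomo.log
endscript
...
}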
Now you can also try it manually:
logrotate -vf /etc/logrotate.d/icecast2
From another terminal you should be able to see its result in real-time with:
tail -f /var/log/logrotate-icecast2-matomo.log
If it works, everything will run automatically from now on: all visits are imported every day, without duplicates and without missing any lines.
More documentation here about the import script itself:
https://github.com/matomo-org/matomo-log-analytics
More documentation here about logrotate:
https://linux.die.net/man/8/logrotate

Related

Programmatically get web request initiator

The Chrome Dev Tools network tab has an initiator column that will show you exactly what code initiated the network request.
I'd like to be able to get network request initiator information programmatically, so I could run a script with a url and a request search string as arguments, and it would return details about where every request with a URL matching request search string came from on the page at url. So given the arguments www.stackoverflow.com and google, the output might look something like this (showing requesting URL, line number, and requested URL):
/ 19 http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
/ 4291 http://www.google-analytics.com/analytics.js
I looked into PhantomJS, but its onResourceRequested callback doesn't provide any initiator information, or context from which it can be derived, according to the documentation: http://phantomjs.org/api/webpage/handler/on-resource-requested.html
Is it possible to do this with PhantomJS at all, or with some other tool or service such as Selenium?
UPDATE
From the comments and answers so far it seems as though this isn't currently supported by Phantom, Selenium or anything else. So here's an alternative approach that might work: Load the page, and all of the assets, and then find any occurrences of request search string in all of the files. How could I do that?
You should file a feature request in the issue tracker against the DevTools. The initiator information is not exported in the HAR, so getting it out of there isn't going to work. As far as I know, no existing API allows for this either.
I've been able to implement a solution that uses PhantomJS to get all of the URLs loaded by a page, and then use a combination of xargs, curl and grep to find the search string at those URLs.
The first piece is this PhantomJS script, which simply outputs every URL requested by a page:
var system = require('system');
var page = require('webpage').create();

// Print the URL of every resource the page requests.
page.onResourceRequested = function(req) {
    console.log(req.url);
};

page.open(system.args[1], function(status) {
    phantom.exit(1);
});
Here it is in action:
$ phantomjs urls.js http://www.stackoverflow.com | head -n6
http://www.stackoverflow.com/
http://stackoverflow.com/
http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
http://cdn.sstatic.net/Js/stub.en.js?v=06bb9dbfaca7
http://cdn.sstatic.net/stackoverflow/all.css?v=af4b547e0e9f
http://cdn.sstatic.net/img/share-sprite-new.svg?v=d09c08f3cb07
For my problem I'm not interested in images, and those can be filtered out by adding the PhantomJS arg --load-images=no.
The second piece is taking all of the URLs and searching them. It's not enough to just output the match, I also need the context around which URL was matched, and ideally which line number too. Here's how to do that:
$ cat urls | xargs -I% sh -c "curl -s % | grep -E -n -o '(.{0,30})SEARCH_TERM(.{0,30})' | sed 's#^#% #'"
We can wrap this all up in a little script, where we'll pipe the output back through grep to get color highlighting on the search string:
#!/bin/bash
phantomjs --load-images=no urls.js $1 | xargs -I% sh -c "curl -s % | grep -E -n -o '(.{0,30})$2(.{0,30})' | sed 's#^#% #' | grep $2 --color=always"
We can then use it to search for any term on any site. Here we're looking for adzerk.net on stackoverflow.com:
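Assuming the wrapper is saved as find-initiator.sh (the file name is just for illustration), the call looks like:
$ ./find-initiator.sh http://www.stackoverflow.com adzerk.net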
So you can see that the adzerk.net request gets initiated somewhere around line 4158 of the main Stack Overflow page. It's not a perfect solution, because the invocation might be somewhere completely different from where the URL is defined, but it's probably close, and certainly a good place to start tracking down the exact invocation site.
There might be a better way to search the contents of each URL. It doesn't look like PhantomJS's onResourceReceived handler currently exposes the resource content, but there is ongoing work to address that, and once that's available all of this will be much simpler.
You can use Chrome's debugger protocol from a process external to Chrome or use the chrome.debugger API in a Chrome extension (see How to retrieve the Initiator of a request when extending Chrome DevTool?).

Trouble invoking lua stored procedure using evalsha from redis

I am trying to use lua scripts stored in Redis as stored procedures.
I would like to be able to store these scripts in Redis once, then look them up and invoke them when needed.
I have been able to add these functions to the :function: keyspace using redis-cli, as follows:
redis-cli
> SET :function:f1 "redis.call('SELECT', 0);local data=redis.call('HGETALL','key:{'..ARGV[1]..'}'); print('f1'); print(ARGV[1]); return data;"
> SET :function:f2 "redis.call('SELECT', 0); local data=redis.call('HGETALL','key:{'..ARGV[1]..'}'); print('f2'); print(ARGV[1]); return data;"
> SET :function:f3 "redis.call('SELECT', 0);local data=redis.call('HGETALL','key:{'..ARGV[1]..'}'); print('f3'); print(ARGV[1]); return data;"
I have also been able to use the following SCRIPT LOAD command to build a script that can look up and run these functions:
SCRIPT LOAD "local f=loadstring(redis.call('get',':function:' .. KEYS[1]));return f()"
This SCRIPT LOAD command gives me back a SHA which I can use to call one of these stored functions. I can run it from the command line like so:
redis-cli SCRIPT LOAD "local f=loadstring(redis.call('get',':function:' .. KEYS[1]));return f()"
#returns:
"31b98f9ad6a416c27e5af91ff4af12235d4da385"
Then I can call one of the functions from redis-cli,
redis-cli
> evalsha 31b98f9ad6a416c27e5af91ff4af12235d4da385 1 f3 1234567890
But I keep getting an error,
(error) ERR Error running script (call to f_ae7d0c88e2be3f907cc9a4f5943817bc380bf68e): #user_script:1: user_script:1: bad argument #1 to 'loadstring' (string expected, got boolean)
Any ideas? suggestions?
You'll have to mangle the KEYS or the redis. namespace.
Josiah Carlson just released a python package for this.
See: here and here.
Josiah also added the package to PyPI.
Hope this helps, TW
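One more thing worth checking (an assumption based on Redis' Lua scripting semantics, not something stated in the answers above): inside a script, redis.call('get', ...) returns the boolean false when the key does not exist, and passing that to loadstring() produces exactly the "string expected, got boolean" error shown above. So it is worth confirming that the function key is actually readable from the database EVALSHA runs against:
redis-cli EXISTS :function:f3
redis-cli GET :function:f3
If EXISTS returns 0, the stored function lives in a different database (or under a different name) than the one the lookup script is reading from.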

Import old apache access logs to webalizer - ignoring records

I installed webalizer on my Apache 2 webserver yesterday and came across the problem that all the old access logs are not used. The directory listing looks like this:
/var/log/apache2/
access.log
access.log1
access.log.10.gz
access.log.11.gz
...
How can I import all my files at once?
I tried several things, but it kept telling me that the records were ignored.
Hope someone can help. Thanks!
I ran into the same problem. I had just installed webalizer, and changed it to incremental mode (here are the relevant entries from my /etc/webalizer/webalizer.conf):
LogFile /var/log/apache2/access.log.1
OutputDir /var/www/htdocs/w
Incremental yes
IncrementalName webalizer.current
And then I ran webalizer by hand, which initialized the non-gz files in my logs directory. After that, any attempt to manually import an older gz logfile (by running webalizer /var/log/apache2/access.log.2.gz for instance) resulted in all of the entries being ignored.
I suspect this is because the entries found in the gz logs were older than the last import, so I had to delete my webalizer.current file (really I cleared the whole dir; either way should work). Finally, in reverse order (oldest first), I could import the old gz files one at a time:
bhs128#home:~$ cd /var/log/apache2
bhs128#home:/var/log/apache2$ sudo rm -rf /var/www/htdocs/w/*
bhs128#home:/var/log/apache2$ ls -1t /var/log/apache2/access.log*gz | grep -o '[0-9]*' | tail -n1
52
bhs128#home:/var/log/apache2$ for i in {52..2}; do webalizer /var/log/apache2/access.log.$i.gz; done
I just had the same problem, and I took a look into the webalizer.current file:
$ head -n 2 webalizer.current
# Webalizer V2.21-02 Incremental Data - 11/05/2019 22:29:02
2019 11 5 22 29 2
The second line seems to contain the timestamp of the last run, so I just changed the year to 2018. After that, I was able to import older log files than the last imported ones, without having to delete all the data first.
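If you prefer to script that manual edit, something like this would do it (a sketch matching the example file above; the line number and years are specific to this example):
sed -i '2s/^2019/2018/' webalizer.current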

How do I test a manual check in check_mk / Nagios

My organization is using Nagios with the check_mk plugin to monitor our nodes. My question is: is it possible to run a manual check from the command line? It is important, process-wise, to be able to test a configuration change before deploying it.
For example, I've prepared a configuration change which uses the ps.perf check type to check the number of httpd processes on our web servers. The check looks like this:
checks = [
    ( ["web"], ALL_HOSTS, "ps.perf", "Number of httpd processes", ( "/usr/sbin/httpd", 1, 2, 80, 100 ) )
]
I would like to test this configuration change before committing and deploying it.
Is it possible to run this check via the command line, without first adding it to main.mk? I'm envisioning something like:
useful_program -H my.web.node -c ps.perf -A /usr/sbin/httpd,1,2,80,100
I don't see any way to do this in the check_mk documentation, but I am hoping there is a way to achieve it.
Thanks!
That is easy to check.
Just make your config changes and then run:
cmk -nv HOSTNAME
That will do a trial run of everything (-n) and show the output (-v), so you can see the same results you would later get in the GUI.
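For the host from the question that would be, for example (my.web.node is just the placeholder host name used above):
cmk -nv my.web.node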
List the check:
$ check_mk -L | grep ps.perf
If it lists ps.perf, then run the following command:
$ check_mk --checks=ps.perf -I Hostname

tried to add a new update.secondary hook to my repos in gitolite and now git push fails

remote: Undefined subroutine &main::repo_rights called at hooks/update line 41.
remote: error: hook declined to update
I have removed the update hook from all of my repos in order to get around this, but I know that they are now wide open.
I ran gl-setup, and I may have mixed versions of gitolite on my machine. I am afraid that I ran gl-setup from a version that is different from the one I am currently running. I am not sure how to tell. Please help. :-(
Update: for a more recent version of Gitolite (namely V3.x or later), the official documentation is "adding your own update hooks", and it uses VREFs (virtual refs).
add this line in the rc file, within the %RC block, if it's not already present, or uncomment it if it's already present and commented out:
LOCAL_CODE => "$ENV{HOME}/local",
copy your update hook to a subdirectory called VREF under this directory, giving it a suitable name (let's say "crlf"):
# log on to gitolite hosting user on the server, then:
cd $HOME
mkdir -p local/VREF
cp your-crlf-update-hook local/VREF/crlf
chmod +x local/VREF/crlf
in your gitolite-admin clone, edit conf/gitolite.conf and add lines like this:
- VREF/crlf = #all
to each repo that should have that "update" hook.
Alternatively, you can simply add this at the end of the gitolite.conf file:
repo #all
- VREF/crlf = #all
Either way, add/commit/push the change to the gitolite-admin repo.
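For reference, here is a minimal sketch of what such a "crlf" hook could look like. It is written as an ordinary git update hook (ref name, old SHA and new SHA as the three positional arguments; a non-zero exit rejects the push). Gitolite's VREF mechanism passes extra arguments and has its own conventions for reporting results, so treat this only as a starting point and check the "adding your own update hooks" documentation for the exact calling convention:
#!/bin/bash
# Hypothetical sketch of a CRLF-rejecting update hook (not from the original post).
refname="$1"
oldsha="$2"
newsha="$3"

zero="0000000000000000000000000000000000000000"

# Nothing to check when a ref is being deleted.
[ "$newsha" = "$zero" ] && exit 0

# For brand-new refs, diff against the empty tree instead of the all-zero old SHA.
if [ "$oldsha" = "$zero" ]; then
    oldsha=$(git hash-object -t tree /dev/null)
fi

# Reject the push if any added line ends with a carriage return.
if git diff "$oldsha" "$newsha" | grep -q $'^+.*\r$'; then
    echo "CRLF line endings detected in $refname; push rejected" >&2
    exit 1
fi

exit 0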
