I am using Logstash 2.3.4.
I receive lines which are basically Apache logs with a small score at the end (calculated through machine learning, thanks to Spark). Here's what a line looks like:
hackazon.lc:80 192.168.100.133 - - [28/Jul/2016:11:07:46 +0200] "GET / HTTP/1.1" 200 10442 "http://192.168.100.123/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36" pred:0.0859964494393
As you can see, the first part is a standard Apache log and the end is pred:0.0859964494393.
The logs are processed by ELK for visualization, and I also want some metrics on the score called pred. Therefore I used the timer option of the metrics filter. Here is my Logstash config file:
input {
  file {
    path => '/home/spark/LogStash/*'
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG} pred:%{NUMBER:pred_score}" }
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  mutate {
    convert => { "pred_score" => "float" }
  }
  geoip {
    source => "clientip"
  }
  metrics {
    timer => [ "pred_score", "%{duration}" ]
  }
}
output {
  # elasticsearch { }
  stdout { codec => rubydebug }
  # riemann {
  #   map_fields => true
  # }
}
I expected to get output with the mean, max, etc. of the pred score. However, I get only zeros, except for the count and rates.
Here is one of the outputs from the timer:
{
    "@version" => "1",
    "@timestamp" => "2016-07-28T09:11:39.522Z",
    "message" => "thamine-OptiPlex-755",
    "pred_score" => {
        "count" => 10,
        "rate_1m" => 0.5533102865966679,
        "rate_5m" => 1.2937302900528778,
        "rate_15m" => 1.490591754983121,
        "min" => 0.0,
        "max" => 0.0,
        "stddev" => 0.0,
        "mean" => 0.0,
        "p1" => 0.0,
        "p5" => 0.0,
        "p10" => 0.0,
        "p90" => 0.0,
        "p95" => 0.0,
        "p99" => 0.0,
        "p100" => 0.0
    }
}
Do you know what I'm doing wrong?
Thanks in advance!
Your grok pattern looks good, but in your Logstash config %{duration} is unknown: neither COMBINEDAPACHELOG nor your pattern defines a duration field.
Change your timer configuration to:
timer => ["pred_score" , "%{pred_score}"]
as pred_score is the variable in your pattern
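Applied to the config above, the metrics filter would then read (a minimal untested sketch):
metrics {
  timer => [ "pred_score", "%{pred_score}" ]
}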
I have a file that is read and split into %objects; the %objects hash is populated as shown below.
$VAR1 = 'cars';
$VAR2 = {
    'car1' => {
        'info1' => '"fast"',
        'info2' => 'boring'
    },
    'car2' => {
        'info1' => '"slow"',
        'info2' => 'boring info'
    },
    'car3' => {
        'info1' => '"unique"',
        'info2' => 'useless info'
    }
};
$VAR3 = 'age';
$VAR4 = {
    'new' => {
        'info3' => 'rust',
        'info4' => '"car1"'
    },
    'old' => {
        'info3' => 'shiny',
        'info4' => '"car2" "car3"'
    }
};
My goal is to insert data like "car1 fast rust, car2 slow shiny, car3 unique shiny" in a DB, but I can't get e.g. "rust" to match based on info4 in age.
my $key  = 'cars';
my $key2 = 'age';
foreach my $obj (keys %{ $objects{$key} }) { # for every car
    my @info1s = $objects{$key}{$obj}{'info1'} =~ m/"(.*?)"/g; # added to clean up all info1
    foreach my $infos (@info1s) {
        # dbh execute insert $obj $infos -- this gives me "car1 fast, car2 slow, car3 unique"
    }
    ...
Can somebody please point me in the right direction to fetch and store info4 with related info1/info2?
Thanks!
I take the objective to be as follows.
Get the values of the (info4) keys in the $VAR4 hashref, at the deepest level, and find them as top-level keys in the $VAR2 hashref. Then associate with each of them both the value of its (info3) key, its "sibling" at $VAR4's deepest level, and the value of the (info1) key from $VAR2.
One can traverse the structure by hand for this purpose, especially if it always has the same two levels as shown, but it's easier and better with libraries. I use Data::Leaf::Walker to get leaves (deepest values) and key-paths to them, and Data::Diver to get values for known paths.
use warnings;
use strict;
use feature 'say';
use Data::Dump;
use Data::Leaf::Walker;
use Data::Diver qw(Dive);
my $hr1 = {
'car1' => { 'info1' => 'fast', 'info2' => 'boring' },
'car2' => { 'info1' => 'slow', 'info2' => 'boring info' },
'car3' => { 'info1' => 'unique', 'info2' => 'useless info' }
};
my $hr2 = {
'new' => { 'info3' => 'rust', 'info4' => 'car1' },
'old' => { 'info3' => 'shiny', 'info4' => 'car2 car3' }
};
my $walker = Data::Leaf::Walker->new($hr2);
my %res;
while ( my ($path, $value) = $walker->each ) {
    next if $path->[-1] ne 'info4';
    # Some "values" have multiple needed values separated by space
    for my $val (split ' ', $value) {
        # Get from the 'info4' path the path to its sibling, 'info3'
        my @sibling_path = ( @{$path}[0 .. $#$path - 1], 'info3' );
        # Collect results: values of `info3` and `info1`
        push @{$res{$val}},
            Dive( $hr2, @sibling_path ),
            Dive( $hr1, $val, 'info1' );
    }
}
dd \%res;
This assumes a few things and takes some shortcuts, for simplicity.
For one, I use the explicit infoN keys from the question, and the two-level structure. If the data is, or can be, different, this shouldn't be hard to adjust.
Next, this assumes that a value like car1 always exists as a key in the other hashref. Add an exists check for that key, as sketched below, if it is possible that it doesn't exist as a key.
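For instance, a guard like this at the top of the split loop (a minimal sketch using the names from the code above):
    next unless exists $hr1->{$val}; # skip info4 values with no matching key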
I've removed some extra quotes from the data. (If that's for database entry, do that when constructing the statement. If data comes in with such extra quotes, it should be easy to adjust the code to take them into account.)
The above program prints
{
car1 => ["rust", "fast"],
car2 => ["shiny", "slow"],
car3 => ["shiny", "unique"],
}
(I use Data::Dump to display complex data structures, for its simplicity and compact default output.)
Update: I normally roam in the LaTeX part of Stack Exchange, where we have to provide a full minimal example to reproduce issues.
So here is a full breakdown. The original description of my problem is found below.
Test DB setup; we are using SQLite. create.sql:
PRAGMA foreign_keys = ON;
DROP TABLE IF EXISTS `member`;
DROP TABLE IF EXISTS `address`;
create table `member` (
`uid` VARCHAR(30) NOT NULL,
`name` VARCHAR(255) DEFAULT '',
CONSTRAINT `pk_uid` PRIMARY KEY(`uid`)
);
INSERT INTO `member` VALUES ('m1','Test 1'),('m2','Test 2');
create table `address` (
`uid` VARCHAR(30) NOT NULL,
`address_type` VARCHAR(30) NOT NULL, -- will either be work or home
`text` TEXT DEFAULT '',
CONSTRAINT `pk_uid_type` UNIQUE(`uid`,`address_type`),
CONSTRAINT `fk_uid`
FOREIGN KEY(uid)
REFERENCES member(uid)
ON DELETE CASCADE
);
INSERT INTO `address` VALUES
('m1','home','home address'),
('m1','work','work address'),
('m2','home','home address');
to be loaded into test.db via
sqlite3 test.db < create.sql
As we can see from the test data m1 has two entries in address whereas m2 has one.
Next, the DBIx::Class setup (I have no idea how to merge this into a single file; ideas are welcome as it would make the test easier). These files are autogenerated via dbicdump; here I've removed all the comments.
Schema.pm:
use utf8;
package Schema;
use strict;
use warnings;
use base 'DBIx::Class::Schema';
__PACKAGE__->load_namespaces;
1;
Schema/Result/Member.pm:
use utf8;
package Schema::Result::Member;
use strict;
use warnings;
use base 'DBIx::Class::Core';
__PACKAGE__->table("member");
__PACKAGE__->add_columns(
"uid",
{ data_type => "varchar", is_nullable => 0, size => 30 },
"name",
{ data_type => "varchar", default_value => "", is_nullable => 1, size => 255 },
);
__PACKAGE__->set_primary_key("uid");
__PACKAGE__->has_many(
"addresses",
"Schema::Result::Address",
{ "foreign.uid" => "self.uid" },
{ cascade_copy => 0, cascade_delete => 0 },
);
# I added
__PACKAGE__->might_have(
"home_address" => "Schema::Result::Address",
#{ 'foreign.uid' => 'self.uid'},
sub {
my $args = shift;
return {
"$args->{foreign_alias}.uid" => "$args->{self_alias}.uid",
"$args->{foreign_alias}.address_type" => 'home',
}
},
{ cascade_copy => 0, cascade_delete => 0 },
);
__PACKAGE__->might_have(
"home_address_alt" => "Schema::Result::Address",
{ 'foreign.uid' => 'self.uid'},
{ cascade_copy => 0, cascade_delete => 0 },
);
__PACKAGE__->might_have(
"work_address" => "Schema::Result::Address",
sub {
my $args = shift;
return {
"$args->{foreign_alias}.uid" => "$args->{self_alias}.uid",
"$args->{foreign_alias}.address_type" => 'work',
}
},
{ cascade_copy => 0, cascade_delete => 0 },
);
1;
Schema/Result/Address.pm:
use utf8;
package Schema::Result::Address;
use strict;
use warnings;
use base 'DBIx::Class::Core';
__PACKAGE__->table("address");
__PACKAGE__->add_columns(
"uid",
{ data_type => "varchar", is_foreign_key => 1, is_nullable => 0, size => 30 },
"address_type",
{ data_type => "varchar", is_nullable => 0, size => 30 },
"text",
{ data_type => "text", default_value => "", is_nullable => 1 },
);
__PACKAGE__->add_unique_constraint("uid_address_type_unique", ["uid", "address_type"]);
__PACKAGE__->belongs_to(
"u",
"Schema::Result::Member",
{ uid => "uid" },
{ is_deferrable => 0, on_delete => "CASCADE", on_update => "NO ACTION" },
);
1;
My test script:
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use open qw/:std :utf8/;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Maxdepth = 0;
use Modern::Perl;
use lib qw(.);
use Schema;
BEGIN {
$ENV{DBIC_TRACE} = 1;
}
my $schema = Schema->connect(
'dbi:SQLite:dbname=test.db',
'',
'',
{
on_connect_do => 'PRAGMA foreign_keys = ON',
sqlite_unicode => 1,
RaiseError => 1,
}
);
my $row = $schema->resultset('Member')->find({ uid => 'm1'},
{
prefetch => ['home_address','work_address'],
}
);
# these are both undef
print Dumper $row->home_address;
print Dumper $row->work_address;
# using
$row = $schema->resultset('Member')->find({ uid => 'm1'},
{
prefetch => ['home_address','work_address'],
result_class => 'DBIx::Class::ResultClass::HashRefInflator',
}
);
# then
print Dumper $row;
# gives
# $VAR1 = {
# 'home_address' => undef,
# 'name' => 'Test 1',
# 'uid' => 'm1',
# 'work_address' => undef
# };
# using the "normal might_have home_address_alt in Member on m2
$row = $schema->resultset('Member')->find({ uid => 'm2'},
{
prefetch => ['home_address_alt'],
result_class => 'DBIx::Class::ResultClass::HashRefInflator',
}
);
say Dumper $row;
# does work, but only because m2 only has a single entry in Address whereas m1 has two
$row = $schema->resultset('Member')->find({ uid => 'm1'},
{
prefetch => ['home_address_alt'],
result_class => 'DBIx::Class::ResultClass::HashRefInflator',
}
);
say Dumper $row;
# which gives this warning: DBIx::Class::Storage::DBI::select_single(): Query returned more than one row. SQL that returns multiple rows is DEPRECATED for ->find and ->single and returns the first found.
The DBIC_TRACE gives
SELECT me.uid, me.name, home_address.uid, home_address.address_type, home_address.text, work_address.uid, work_address.address_type, work_address.text FROM member me LEFT JOIN address home_address ON ( home_address.address_type = ? AND home_address.uid = ? ) LEFT JOIN address work_address ON ( work_address.address_type = ? AND work_address.uid = ? ) WHERE ( me.uid = ? ): 'home', 'me.uid', 'work', 'me.uid', 'm1'
Which if you run it manually against test.db gives
m1|Test 1|m1|home|home address|m1|work|work address
So the SQL is capable of producing the correct output. But the accessors/objects, whatever you want to call them, keep being empty. I'd like to know why.
My original question:
In my data I have members, and each can have up to two addresses (home and work), both stored in the same table.
So I have something similar to
Member; primary key(uid)
Address; unique(uid,address_type) # the latter is work or home
When I grab a member, I'd like to prefetch the up-to-two addresses using might_have relationships. So in Schema::Result::Member I have:
__PACKAGE__->might_have(
"home_address" => "Schema::Result::Address",
sub {
my $args = shift;
return {
"$args->{foreign_alias}.uid" => "$args->{self_alias}.uid",
"$args->{foreign_alias}.address_type" => 'home',
}
},
{ cascade_copy => 0, cascade_delete => 0 },
);
__PACKAGE__->might_have(
"work_address" => "Schema::Result::Address",
sub {
my $args = shift;
return {
"$args->{foreign_alias}.uid" => "$args->{self_alias}.uid",
"$args->{foreign_alias}.address_type" => 'work',
}
},
{ cascade_copy => 0, cascade_delete => 0 },
);
And I call it via
my $row = $self->schema->resultset('Member')
->find({uid => $uid},
{
prefetch => [qw/home_address work_address/],
});
As far as I can see from DBIC_TRACE the generated SQL is correct
... LEFT JOIN address home_address ON ( home_address.address_type = ? AND home_address.uid = ? ) LEFT JOIN address work_address ON ( work_address.address_type = ? AND work_address.uid = ? ) WHERE ( me.uid = ? ): 'home', 'me.uid', 'work', 'me.uid', '120969'
but $row->home_address is always just undef and I do not understand why.
I have also tried
__PACKAGE__->might_have(
"home_address" => "Schema::Result::Address",
{ 'foreign.uid' => 'self.uid' },
{ where => { 'address_type' => 'home' } , cascade_copy => 0, cascade_delete => 0 },
);
__PACKAGE__->might_have(
"work_address" => "Schema::Result::Address",
{ 'foreign.uid' => 'self.uid' },
{ where => { 'address_type' => 'work' } , cascade_copy => 0, cascade_delete => 0 },
);
but the where part is never a part of the DBIC_TRACE.
Any ideas as to what I'm missing here?
The DBIx::Class docs have an example of a custom relationship with fixed values on the remote side of the rel: https://metacpan.org/pod/DBIx::Class::Relationship::Base#Custom-join-conditions.
The part you've missed is the -ident, so DBIC can distinguish between a fixed value and a related column.
Without it, the query ends up with a bind variable that is passed the literal string 'me.uid' on execution, as you can see in your DBIC_TRACE output.
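A minimal sketch of the corrected relationship; only the column-to-column comparison changes, wrapped in -ident:
__PACKAGE__->might_have(
    "home_address" => "Schema::Result::Address",
    sub {
        my $args = shift;
        return {
            # -ident tells DBIC this is a column reference, not a literal value
            "$args->{foreign_alias}.uid" => { -ident => "$args->{self_alias}.uid" },
            "$args->{foreign_alias}.address_type" => 'home',
        };
    },
    { cascade_copy => 0, cascade_delete => 0 },
);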
My Kibana 5.6.8 / Logstash configuration seems to read only one log file.
My logstash.conf in /home/elasticsearch/confLogs is:
input {
  file {
    type => "static"
    path => "/home/elasticsearch/static_logs/**/*Web.log*"
    exclude => "*.zip"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  if [type] == "static" {
    if [message] !~ /(.+)/ {
      drop { }
    }
    grok {
      patterns_dir => "./patterns"
      overwrite => [ "message" ]
      # 2017-08-07 11:47:35,466 INFO [http-bio-10.60.2.19-10267-exec-60] jsch.DeployManagerFileUSImpl (DeployManagerFileUSImpl.java:155) - Deconnexion de l'hote qvizzza3
      # 2017-08-07 11:47:51,775 ERROR [http-bio-10.60.2.19-10267-exec-54] service.BindingsRSImpl (BindingsRSImpl.java:143) - Can't find bindings file deployed on server
      # 2017-08-03 16:01:11,352 WARN [Thread-552] pcf2.AbstractObjetMQDAO (AbstractObjetMQDAO.java:137) - Descripteur de
      match => [ "message", "%{TIMESTAMP_ISO8601:logdate},%{INT} %{LOGLEVEL:logLevel} \[(?<threadname>[^\]]+)\] %{JAVACLASS:package} \(%{JAVAFILE:className}:%{INT:line}\) - %{GREEDYDATA:message}" ]
    }
    # 2017-08-03 16:01:11,352
    date {
      match => [ "logdate", "YYYY-MM-dd hh:mm:ss" ]
      target => "logdate"
    }
  }
}
output {
  elasticsearch { hosts => ["192.168.99.100:9200"] }
}
My logs directory, with load-balanced logrotate files:
static_logs
--prd1
----mlog Web.log
----mlog Web.log.1
----mlog Web.log.2
--prd2
----mlog Web.log
----mlog Web.log.2
Where is my mistake?
My patterns are in /home/elasticsearch/confLogs/patterns/grok-patterns, which includes TIMESTAMP_ISO8601.
Regards
If my log files are larger than 140 MB, the logdate field is not seen as a date field but as a string field!
I'm using DBIx::Class and generating conditions for a search like this:
my @array;
push @array, { condition1 => 'value1' };
push @array, [ { condition2 => 'value2' }, { condition3 => 'value3' } ];
All these conditions must be checked using the AND operator, that's why I wrote this:
@array = ( -and => @array );
After running code with such conditions, the process on my virtual machine started to use up to 8 GB of memory. I thought it was a recursion problem, and I wasn't mistaken: I checked the logs and saw records about deep recursion, but I couldn't find anything about my case on the internet.
Is there a problem with assigning a list containing an array to the array itself?
Or maybe it is a problem with DBIx::Class (SQL::Abstract)? Why does it cause deep recursion?
Update. This is the real code from the project:
sub faq {
    my ( $self ) = @_;
    my @cond;
    if ( $self->param('faq_type') ) {
        push @cond, {
            'me.faq_type' => $self->param('faq_type'),
        };
    }
    if ( my $search = $self->param('search') ) {
        push @cond, [
            'me.title' => { ilike => "%$search%" },
            'me.text'  => { ilike => "%$search%" },
        ];
    }
    @cond = ( -and => @cond );
    my %attr = (
        join         => 'page_category',
        rows         => $self->param('limit'),
        offset       => $self->param('offset'),
        order_by     => { -desc => 'id' },
        result_class => 'BUX::Util::HashRefInflator',
        '+select'    => [ qw( page_category.name ) ],
        '+as'        => [ qw( category_name ) ],
    );
    my @pages       = BUX::DB->rs('Page')->search( \@cond, \%attr )->all;
    my $total_count = BUX::DB->rs('Page')->count( \@cond );
    return $self->render(json => {
        pages => \@pages,
        count => $total_count,
    });
}
And log records:
Deep recursion on subroutine "SQL::Abstract::_SWITCH_refkind" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 719.
Deep recursion on subroutine "SQL::Abstract::_recurse_where" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 546.
Deep recursion on subroutine "SQL::Abstract::_where_ARRAYREF" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 687.
Deep recursion on subroutine "SQL::Abstract::_where_HASHREF" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 493.
Deep recursion on subroutine "SQL::Abstract::_where_unary_op" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 596.
Deep recursion on subroutine "SQL::Abstract::_where_op_ANDOR" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 645.
P.S. BUX::DB is a subclass of DBIx::Class, and rs is a shortcut for resultset.
When specifying several conditions that should all be met in a DBIx::Class search, the usual way is to pass a hashref with the column names as keys and the conditions as values.
While it is possible to instead specify an arrayref of hashrefs with the '-and' keyword, this is most often unnecessary, especially if you only have one condition to specify!
NOTE: I am not certain ( -and => @cond ) does what you want; have you tried replacing it with { -and => \@cond } (note the arrayref)? This could be the reason why SQL::Abstract gets confused, though I'm unsure how that would end up being a recursion.
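At the call site that would look like this (an untested sketch based on the question's code):
my @pages = BUX::DB->rs('Page')->search( { -and => \@cond }, \%attr )->all;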
SECOND NOTE: I find @cond = ( -and => \@cond ) confusing and it may cause trouble. I would suggest working with a hashref passed into search, as it should be called with, and setting the -and key instead, by adapting my first example.
This is how I would specify the conditions:
my $cond;
if ( my $faq_type = $self->param('faq_type') ) {
    $cond->{'me.faq_type'} = $faq_type;
}
if ( my $search = $self->param('search') ) {
    $cond->{-or} = [
        { 'me.title' => { ilike => "%$search%" } },
        { 'me.text'  => { ilike => "%$search%" } },
    ];
}
An alternative to consider would be to first apply the 'faq_type' search and store the resulting resultset, then refine it further as necessary; this seems more in line with the spirit of DBIx::Class to me:
my $pages_rs = BUX::DB->rs('Page');
if ( my $faq_type = $self->param('faq_type') ) {
    $pages_rs = $pages_rs->search({ 'me.faq_type' => $faq_type });
}
if ( my $search = $self->param('search') ) {
    $pages_rs = $pages_rs->search({
        -or => [
            'me.title' => { ilike => "%$search%" },
            'me.text'  => { ilike => "%$search%" },
        ],
    });
}
my %attr = (
    join         => 'page_category',
    rows         => $self->param('limit'),
    offset       => $self->param('offset'),
    order_by     => { -desc => 'id' },
    result_class => 'BUX::Util::HashRefInflator',
    '+select'    => [ qw( page_category.name ) ],
    '+as'        => [ qw( category_name ) ],
);
$pages_rs = $pages_rs->search( undef, \%attr );
my @pages = $pages_rs->all; # This executes the query
Please keep in mind this is untested as I currently don't have an easy way of verifying this. If this does not help, feel free to comment and I'll try and fix whatever may be off.
EDIT: to not leave something in that is wrong, I've removed the (irrelevant) page count I put in.
I am using mime_content_type() in PHP 5.5 to get a MIME type, but it throws a fatal error: function not found.
How can I achieve this on PHP 5.5?
Make use of the finfo functions.
A simple illustration:
<?php
$finfo = finfo_open(FILEINFO_MIME_TYPE);
echo finfo_file($finfo, "path/to/image_dir/image.gif");
finfo_close($finfo);
OUTPUT:
image/gif
Note: Windows users must include the bundled php_fileinfo.dll in php.ini to enable this extension.
I've spent too much time trying to get the finfo functions to work properly. I finally ended up creating my own function to match the file extension to an array of MIME types. It's not a foolproof way of ensuring that files are truly what their extension denotes them to be, but that problem can be mitigated by how you process the I/O of said files on your server(s).
function mime_type($file) {
// there's a bug that doesn't properly detect
// the mime type of css files
// https://bugs.php.net/bug.php?id=53035
// so the following is used, instead
// src: http://www.freeformatter.com/mime-types-list.html#mime-types-list
$mime_type = array(
"3dml" => "text/vnd.in3d.3dml",
"3g2" => "video/3gpp2",
"3gp" => "video/3gpp",
"7z" => "application/x-7z-compressed",
"aab" => "application/x-authorware-bin",
"aac" => "audio/x-aac",
"aam" => "application/x-authorware-map",
"aas" => "application/x-authorware-seg",
"abw" => "application/x-abiword",
"ac" => "application/pkix-attr-cert",
"acc" => "application/vnd.americandynamics.acc",
"ace" => "application/x-ace-compressed",
"acu" => "application/vnd.acucobol",
"adp" => "audio/adpcm",
"aep" => "application/vnd.audiograph",
"afp" => "application/vnd.ibm.modcap",
"ahead" => "application/vnd.ahead.space",
"ai" => "application/postscript",
"aif" => "audio/x-aiff",
"air" => "application/vnd.adobe.air-application-installer-package+zip",
"ait" => "application/vnd.dvb.ait",
"ami" => "application/vnd.amiga.ami",
"apk" => "application/vnd.android.package-archive",
"application" => "application/x-ms-application",
// etc...
// truncated due to Stack Overflow's character limit in posts
);
$extension = \strtolower(\pathinfo($file, \PATHINFO_EXTENSION));
if (isset($mime_type[$extension])) {
return $mime_type[$extension];
} else {
throw new \Exception("Unknown file type");
}
}
Edit:
I'd like to address Davuz's comment (since it keeps getting up-voted) and remind everyone that I put a pseudo-disclaimer at the top saying this isn't foolproof. So please keep that in mind when considering the approach I've offered in my answer.
mime_content_type() is not deprecated and works fine.
Why is mime_content_type() deprecated in PHP?
http://php.net/manual/en/function.mime-content-type.php
As of PHP 5.3, it's even built-in.
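A one-line check, assuming the bundled fileinfo extension is enabled and the path exists:
<?php
echo mime_content_type('path/to/image.gif'); // e.g. image/gif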
$finfo = finfo_open(FILEINFO_MIME_TYPE); should do it.
Taken from the php.net docs. Your function is deprecated and probably already removed.
http://www.php.net/manual/en/function.finfo-file.php
You should understand that file_get_contents will load the whole file into memory, which is not a good way to get only the MIME type. You don't need a buffer method or the file_get_contents function in this case.
To prevent any errors and warnings, it is better to do it like this:
$filename = 'path to your file';
if (class_exists('finfo')) {
$finfo = new finfo(FILEINFO_MIME_TYPE);
if (is_object($finfo)) {
echo $finfo->file($filename);
}
} else {
echo 'fileinfo is not installed';
}
Also, you should know that $finfo->file will throw a PHP warning if it fails.
If fileinfo is not installed properly and you have a fresh version of PHP, you can get the MIME type from HTTP headers. You can use cURL for this:
$ch = curl_init();
curl_setopt_array($ch, array(
CURLOPT_HEADER => true,
CURLOPT_NOBODY => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_SSL_VERIFYPEER => false,
CURLOPT_SSL_VERIFYHOST => false,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_MAXREDIRS => 1,
CURLOPT_URL => $link)
);
$headers = curl_exec($ch);
curl_close($ch);
if (preg_match('/Content-Type:\s(.*)/i', $headers, $matches)) {
echo trim($matches[1], "\t\n\r");
} else {
echo 'There is no content type in the headers!';
}
You can also use the get_headers function, but it is slower than a cURL request.
$url = 'http://www.example.com';
$headers = get_headers($url, 1);
echo $headers['Content-Type'];
Get the image size using:
$infFil = getimagesize($the_file_name);
and
echo $infFil["mime"];
getimagesize returns an associative array which has a "mime" key and, obviously, the image size too.
I used it and it works.
I use the MimeTypeTool from Bat (https://github.com/lingtalfi/Bat)
It uses fileinfo if available, and falls back to an "extension => mime type" mapping otherwise.
This is the best solution I found, combining two very good posts:
// Thanks to http://php.net/manual/en/function.mime-content-type.php#87856
function getMimeContentType($filename, $ext)
{
    if (!function_exists('mime_content_type'))
    {
        if ($mime_types = getMimeTypes())
        {
            if (array_key_exists($ext, $mime_types))
            {
                return $mime_types[$ext];
            }
            elseif (function_exists('finfo_open'))
            {
                $finfo = finfo_open(FILEINFO_MIME);
                $mimetype = finfo_file($finfo, $filename);
                finfo_close($finfo);
                return $mimetype;
            }
        }
        return 'application/octet-stream';
    }
    return mime_content_type($filename);
}
// Thanks to http://php.net/manual/en/function.mime-content-type.php#107798
function getMimeTypes()
{
    $url = 'http://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf/mime.types';
    $mimes = array();
    foreach (@explode("\n", @file_get_contents($url)) as $x)
    {
        if (isset($x[0]) && $x[0] !== '#' && preg_match_all('#([^\s]+)#', $x, $out) && isset($out[1]) && ($c = count($out[1])) > 1)
        {
            for ($i = 1; $i < $c; $i++)
            {
                $mimes[$out[1][$i]] = $out[1][0];
            }
        }
    }
    return (@sort($mimes)) ? $mimes : false;
}
Use it like this:
$filename = '/path/to/the/file.pdf';
$ext = strtolower(array_pop(explode('.',$filename)));
$content_type = getMimeContentType($filename, $ext);
This will continue to work even if the mime_content_type function is no longer supported in PHP.