How to split custom logs and add a custom field name to each value in Logstash - arrays

I want to split this custom log:
"2016-05-11 02:38:00.617,userTestId,Key-string-test113321,UID-123,10079,0,30096,128,3"
The log means:
Timestamp, String userId, String setlkey, String uniqueId, long providerId, String itemCode1, String itemCode2, String itemCode3, String serviceType
I tried to make a filter using Ruby:
filter {
  ruby {
    code => "
      fieldArray = event['message'].split(',')
      for field in fieldArray
        result = field
        event[field[0]] = result
      end
    "
  }
}
but I don't know how to split the log and assign a field name to each value, as below:
Timestamp : 2016-05-11 02:38:00.617
userId : userTestId
setlkey : Key-string-test113321
uniqueId : UID-123
providerId : 10079
itemCode1 : 0
itemCode2 : 30096
itemCode3 : 128
serviceType : 3
How can I do this?
Thanks and regards.

You can use the grok filter instead. The grok filter parses the line with a regex, and you can associate each group with a field.
It is possible to parse your log with this pattern:
grok {
  match => {
    "message" => [
      "%{TIMESTAMP_ISO8601:timestamp},%{USERNAME:userId},%{USERNAME:setlkey},%{USERNAME:uniqueId},%{NUMBER:providerId},%{NUMBER:itemCode1},%{NUMBER:itemCode2},%{NUMBER:itemCode3},%{NUMBER:serviceType}"
    ]
  }
}
This will create the fields you wish to have.
Reference: grok patterns on GitHub
To test: Grok Constructor
Another solution:
You can use the csv filter, which is even closer to your needs (I went with the grok filter first since I have more experience with it): csv filter documentation
The csv filter takes an event field containing CSV data, parses it, and stores it as individual fields (you can optionally specify the names). It can also parse data with any separator, not just commas.
I have never used it, but it should look like this:
csv {
  columns => [ "Timestamp", "userId", "setlkey", "uniqueId", "providerId", "itemCode1", "itemCode2", "itemCode3", "serviceType" ]
}
By default, the filter works on the message field with the "," separator, so there is no need to configure them.
I think that the csv filter solution is better.
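For completeness, here is a minimal end-to-end sketch of the csv approach (untested; the date filter at the end is optional and assumes you want the Timestamp column promoted to @timestamp):
filter {
  csv {
    # "message" is the default source field and "," the default separator
    columns => [ "Timestamp", "userId", "setlkey", "uniqueId", "providerId", "itemCode1", "itemCode2", "itemCode3", "serviceType" ]
  }
  # optional: parse the Timestamp column into @timestamp
  date {
    match => [ "Timestamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}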

Related

Parsing an array of JSON objects from logfile in Logstash

I have logs in the following type of format:
2021-10-12 14:41:23,903716 [{"Name":"A","Dimen":[{"Name":"in","Value":"348"},{"Name":"ses","Value":"asfju"}]},{"Name":"read","A":[{"Name":"ins","Value":"348"},{"Name":"ses","Value":"asf5u"}]}]
2021-10-12 14:41:23,903716 [{"Name":"B","Dimen":[{"Name":"in","Value":"348"},{"Name":"ses","Value":"a7hju"}]},{"Name":"read","B":[{"Name":"ins","Value":"348"},{"Name":"ses","Value":"ashju"}]}]
Each log is on a new line. The problem is that I want each object in the top-level array of a single line to become a separate document, parsed accordingly.
I need to parse this and send it to Elasticsearch. I have tried a number of filters (grok, json, split, etc.) and I cannot get it to work the way I need to, and I have little experience with these filters, so if anyone can help it would be much appreciated.
The json codec is what I would need if I could remove the text/timestamp from the file:
"If the data being sent is a JSON array at its root multiple events will be created (one per element)"
If there is a way to do that, this would also be helpful.
This is a config example for your use case:
input { stdin {} }

filter {
  grok {
    match => { "message" => "%{DATA:date},%{DATA:some_field} %{GREEDYDATA:json_message}" }
  }
  # Use the json plugin to parse the raw string into JSON
  json { source => "json_message" target => "json" }
  # and split the result into dedicated events (one per array element)
  split { field => "json" }
}

output {
  stdout {
    codec => rubydebug
  }
}
If you need to parse the start of the log as a date, you can use grok with a date pattern, or concatenate the two fields and set them as the source for the date plugin.
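For example, a minimal sketch using the date plugin on the first capture alone (the field names date and some_field come from the grok pattern above; the sub-second part held in some_field is simply dropped here, so adjust if you need that precision):
filter {
  # parse the first grok capture as the event timestamp;
  # "some_field" holds the microseconds and is ignored in this sketch
  date {
    match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
  }
}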

Return first n elements from array in Elasticsearch query

I have an array field in a document named IP which contains over 10,000 IPs as elements.
For example:
IP: ["192.168.a:A", "192.168.a:B", "192.168.a:C", "192.168.A:b", ...]
Now I make a search query with some filters and I get the results, but the size of the result is huge because of the field above.
Now I want to fetch only N IPs from the array, let's say only 10; order doesn't matter.
So how do I do that?
Update:
Apart from the IP field there are other fields too, and I applied the filters on those fields, not on IP. I want the whole document that satisfies the filters; I just want to limit the number of elements in the single IP field. (Let me know if there is a way to do this without using a script, too.)
This kind of request could solve your problem:
GET ips/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "truncate_ip": {
      "script": {
        "source": """
          // copy the first 10 entries of the IP array from _source
          // (assumes the array has at least 10 elements)
          String[] trunc_ip = new String[10];
          for (int i = 0; i < 10; ++i) {
            trunc_ip[i] = params['_source']['IP'][i];
          }
          return trunc_ip;
        """
      }
    }
  }
}
You can use scripted fields (script_fields) to generate a new field from existing fields in Elasticsearch. Details are added as comments.
GET indexName/_search
{
  "_source": {
    "excludes": "ips" // <======= Exclude the IP field from _source (change the name based on your document)
  },
  "query": {
    "match_all": {} // <========== Define relevant filters
  },
  "script_fields": {
    "limited_ips": { // <========= add a new scripted field
      "script": {
        "source": "params['_source'].ips.stream().limit(2).collect(Collectors.toList())" // <==== Replace 2 with the number of IPs you want in the result.
      }
    }
  }
}
Note:
If you remove _source, then only the scripted field will be part of the result.
Apart from accessing the value of the field, the rest of the syntax is Java. Change it as it suits you.
Apart from non-analyzed text fields, use doc['fieldName'] to access a field within a script. It is faster. See the excerpt from the Elasticsearch docs below:
"By far the fastest most efficient way to access a field value from a script is to use the doc['field_name'] syntax, which retrieves the field value from doc values. Doc values are a columnar field value store, enabled by default on all fields except for analyzed text fields."
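For instance, assuming IP is mapped as a keyword field (so it has doc values), a doc-values variant of the truncation script might look like the sketch below; note that doc values come back sorted and de-duplicated, which may or may not matter for your case:
GET ips/_search
{
  "query": { "match_all": {} },
  "script_fields": {
    "truncate_ip_dv": {
      "script": {
        "source": """
          // doc values are faster than _source, but are sorted and de-duplicated
          int n = Math.min(10, doc['IP'].size());
          List trunc_ip = new ArrayList();
          for (int i = 0; i < n; ++i) {
            trunc_ip.add(doc['IP'][i]);
          }
          return trunc_ip;
        """
      }
    }
  }
}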
By default ES returns only 10 matching results, so I am not sure what your search query is and what exactly you want to restrict:
the number of elements in a single IP field, or
the number of documents matching your search.
Please clarify the above and provide your search query to help further.
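If it is the second reading (limiting the number of matching documents), the standard size parameter is enough; a minimal sketch against the same ips index used above:
GET ips/_search
{
  "size": 10,
  "query": {
    "match_all": {}
  }
}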

How can I filter results by custom field in Solr query?

I need a custom field filter for my Solr data like:
{
  "id": "1",
  "name": "Test title",
  "language": "en"
},
{
  "id": "2",
  "name": "Test title",
  "language": "fr",
  "parent": "1"
}
I need to get just the first item with the query
/select?q=name:test
So I need to filter results by the parent field in such a way that only one of the items is present in the result.
Thanks for any ideas.
When I need to run queries in Solr, I use SearchQuery() and set filterQueries inside it. There it is possible to set filters for my search.
final String FIELD_NAME = "name_text_mv"; // name of my field in Solr
SearchQuery searchQuery = init(facetSearchConfig); // init configs
searchQuery.setFreeTextQueryBuilder(text); // set the text of my search
setFiltersFreeTextSearch(searchQuery.getFilterQueries(), text, FIELD_NAME);
The function that does the magic (adds my filters to the search):
private void setFiltersFreeTextSearch(List<QueryField> filters, String text, String... fields) {
    text = StringUtils.stripAccents(text).toLowerCase();
    String textCapitalized = capitalizeEachWolrd(text.toLowerCase());
    for (String field : fields) {
        QueryField queryField = new QueryField(field, SearchQuery.Operator.OR, SearchQuery.QueryOperator.CONTAINS,
                text, text.toUpperCase(), textCapitalized);
        filters.add(queryField);
    }
}
As you can see, with QueryField you can add the 'wheres' of your search in Solr. I was using CONTAINS, which is my 'LIKE', and OR to find any item.
So basically you can use QueryField() to add filters for your specific field.
Well, this was my solution for my case; anyway, it is just an idea. :)
(The project uses Java.)

Turning Array Into String in SnapLogic

I have the output of a Salesforce SOQL snap, which is JSON in this format.
[
  {
    "QualifiedApiName": "Accelerator_Pack__c"
  },
  {
    "QualifiedApiName": "Access_Certifications__c"
  },
  {
    "QualifiedApiName": "Access_Requests__c"
  },
  {
    "QualifiedApiName": "Account_Cleansed__c"
  },
  {
    "QualifiedApiName": "Account_Contract_Status__c"
  }
]
I am attempting to take those values and turn them into a string with the values separated by commas, like this, so that I can use that in the SELECT clause of another query.
Accelerator_Pack__c, Access_Certifications__c, Access_Requests__c, Account_Cleansed__c, Account_Contract_Status__c
From the documentation, my understanding was that .toString() would convert the array into a comma-separated string, but as shown in the attached image, it isn't doing anything. Does anyone have experience with this?
You need to aggregate the incoming documents.
Use the Aggregate snap with the CONCAT function. This will give you a |-delimited concatenated string as output, like the following.
Accelerator_Pack__c|Access_Certifications__c|Access_Requests__c|Account_Cleansed__c|Account_Contract_Status__c
You can then replace the | with , using either $concatenated_fields.split('|').join(',') or $concatenated_fields.replace(/\|/g, ',').
The original answer included screenshots of the detailed configuration: the sample pipeline, the sample input (the JSON above placed in a JSON Generator snap for testing), the Aggregate snap settings, the resulting |-delimited concatenated string, the Mapper expressions, and the final output. Both expressions give the same result.
You can also use the array functions directly to achieve this. See the below pipeline that can be used to concatenate the values:
I used a JSON Generator snap to take your sample data as input.
Then I used a GroupByN snap with '0' as the group size to build the array.
Finally, in the Mapper you can use the below expression to concatenate:
jsonPath($, "$arrayAccom[*].QualifiedApiName").join(",")
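Once the comma-separated string is available in a field, it can be interpolated into the next query's expression. For example, in a downstream snap (the $field_list field name and the Account object here are only placeholders for this sketch):
"SELECT " + $field_list + " FROM Account"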

TextCriteria with diacriticSensitive. Spring Data MongoDB 1.10.9

I'm trying to create a search in my MongoDB database that searches the "name" field without taking accents into account.
I need to create an index on the field:
// create index
@Indexed
@Field("nombre")
private String nombre;
I check in the database that it is created correctly:
db.empleado_bk.getIndexes();
{
  "v" : 2,
  "key" : {
    "nombre" : 1
  },
  "name" : "nombre",
  "ns" : "elser2.empleado_bk"
}
I modify my repository to do a text search without taking accents into account:
if (StringUtils.isNoneBlank(dtoFilter.getNombre())) {
    query.addCriteria(TextCriteria.forDefaultLanguage().diacriticSensitive(true).matching("nombre"));
}
But when searching on that field, I get the following error:
org.springframework.data.mongodb.UncategorizedMongoDbException: Query failed with error code 27 and error message 'text index required for $text query'
Can someone tell me if I'm doing something wrong or if I need to do something else.
You applied a standard index with @Indexed. To apply a text search index, you need to use @TextIndexed.
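A minimal sketch of the change, assuming Spring Data MongoDB's @TextIndexed annotation on the same entity (the class name Empleado is only a guess based on the empleado_bk collection):
import org.springframework.data.mongodb.core.index.TextIndexed;
import org.springframework.data.mongodb.core.mapping.Field;

public class Empleado {

    // @TextIndexed adds this field to the collection's text index,
    // which is what the $text / TextCriteria query requires
    @TextIndexed
    @Field("nombre")
    private String nombre;
}
Separately, note that diacriticSensitive(true) makes the criteria accent-sensitive, which appears to be the opposite of what is described, and matching("nombre") matches the literal word "nombre" rather than the search term; matching(dtoFilter.getNombre()) is probably what was intended.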
