Import old data from Postgres to Elasticsearch

I have a lot of data in my Postgres database (on a remote server). It's data from the past year, and I want to push it to Elasticsearch now.
The data has a time field in this format: 2016-09-07 19:26:36.817039+00.
I want this to be the time field (@timestamp) in Elasticsearch, so that I can view it in Kibana and see some visualizations over the last year.
I need help with how to push all this data efficiently; I can't figure out how to get it all out of Postgres.
I know we can ingest data via the JDBC plugin, but I think I cannot create my @timestamp field with that.
I also know about ZomboDB, but I'm not sure whether it lets me specify my own time field.
Also, the data is in bulk, so I am looking for an efficient solution.
Suggestions are welcome.

I know we can ingest data via the JDBC plugin, but I think I cannot create my @timestamp field with that.
This should be doable with Logstash. A good starting point is this blog post. And remember that a Logstash pipeline always consists of three parts:
Input: The JDBC input. If you only need to import once, skip the schedule; otherwise set the right timing in cron syntax.
Filter: This one is not part of the blog post. You will need to use the date filter to set the right @timestamp value; an example is added at the end.
Output: This is simply the Elasticsearch output.
This will depend on the format and field name of the timestamp value in PostgreSQL, but the filter part should look something like this:
date {
  match => ["your_date_field", "dd-MM-yyyy HH:mm:ss"]
  remove_field => "your_date_field" # Remove the now redundant field, since we're storing it in @timestamp (the default target of date)
}
If you're concerned about performance:
You will need to set the right jdbc_fetch_size.
Elasticsearch output is batched by default.
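Putting the three parts together, here is a minimal pipeline sketch. The connection details, table, and field names are placeholders, and the date pattern is only a guess at the PostgreSQL format shown in the question; test it against real rows first, since the fractional seconds and the +00 offset may need adjusting:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://your-host:5432/your_db" # placeholder
    jdbc_user => "your_user"
    jdbc_password => "your_password"
    jdbc_driver_library => "/path/to/postgresql-jdbc.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * FROM your_table" # hypothetical table
    jdbc_fetch_size => 10000 # tune to your row size
    # no schedule set, so the import runs once and exits
  }
}
filter {
  date {
    # pattern guessed from 2016-09-07 19:26:36.817039+00; verify before a full run
    match => ["your_date_field", "yyyy-MM-dd HH:mm:ss.SSSSSSZZ"]
    remove_field => "your_date_field"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "your-index"
  }
}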

Related

Convert FileTime to DateTime in Azure Logic App

I'm pretty new to Logic Apps, so I'm still learning my way around custom expressions. One thing I cannot seem to figure out is how to convert a FileTime value to a DateTime value.
FileTime value example: 133197984000000000
I don't have a desired output format, as long as Logic Apps can understand that this is a DateTime value and can run before/after date logic on it.
To achieve your requirement, I converted the Windows FileTime to Unix time and then added that as seconds to the base date 1970-01-01T00:00:00Z. Here is the official documentation that I followed. Below is the expression that worked for me.
addSeconds('1970-01-01T00:00:00Z', div(sub(133197984000000000,116444736000000000),10000000))
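To unpack the arithmetic (my own breakdown, using the sample value above): a Windows FileTime counts 100-nanosecond ticks since 1601-01-01, and 116444736000000000 is the number of ticks between that epoch and the Unix epoch of 1970-01-01. Subtracting the offset and dividing by 10,000,000 converts ticks to whole seconds:
133197984000000000 - 116444736000000000 = 16753248000000000 ticks
16753248000000000 / 10000000 = 1675324800 seconds
addSeconds then yields 2023-02-02T08:00:00Z for the sample value.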
This isn't likely to float your boat but the Advanced Data Operations connector can do it for you.
The unfortunate piece of the puzzle is that (at this stage) it doesn't just work as-is, but rest assured that this functionality is coming.
Meaning, you need to do some trickery if you want to use it to do what you want.
By this I mean, if you use the Xml to Json operation, you can use the built-in functions that come with the conversion to do it for you.
This is an example of what I mean ...
You can see that I have constructed some XML that is then passed into the Data parameter. That XML contains your Windows file time value.
I have then set up the Map Object to take that value and use the built-in ado function FromWindowsFileTime to convert it to a date time value.
The Primary Loop at Element is the XPath query that will make the selection to return the relevant values to loop over.
Disclaimer: I should point out, this is due to drop in preview sometime in the middle of Jan 2023.
They have another operation in development that will allow you to do this a lot more easily, but for now, this is your easiest and cheapest option.
This kind of thing is also available in the Transform and Expert operations, but that's the next pricing tier up.

How to write an OData filter in a Logic App where the field has a date time and I just want to use the date portion

There is a record in the Salesforce Stay Information which has the following information. "Booked_Check_in_Date_Time__c": "2022-11-05T00:59:00Z"
When I try the following OData filter it does not work.
Booked_Check_in_Date_Time__c eq 2022-11-05
What do I need to change to bring back this record?
This is because you are not using the complete date-time value in your filter.
What do I need to change to bring back this record?
In your case you need to use the below query format to get the filtered results.
Booked_Check_in_Date_Time__c eq 2022-11-05T00:59:00Z
It looked like I needed to use Local_Booked_Checkin_Date__c to achieve what I needed. If I used Booked_Check_in_Date_Time__c, I would have had to add the time to the filter, which I did not want to do.
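If you only know the date and not the exact time, a common workaround (assuming the connector accepts the standard OData comparison operators ge and lt, which is worth verifying) is to bracket the whole day instead of testing for equality:
Booked_Check_in_Date_Time__c ge 2022-11-05T00:00:00Z and Booked_Check_in_Date_Time__c lt 2022-11-06T00:00:00Z
This returns every record whose check-in falls anywhere on 2022-11-05, without needing a separate date-only field.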

Hacked SQL Server database need regex

A database that a client of mine has was hacked. I am in the process of trying to rebuild the data. The site is running classic ASP with a SQL Server database. I believe I have found where the weak point was for the hackers and removed that entry point for now.
Every text column in the database was appended with some HTML markup and inline script/JS tags.
Here is an example of a field:
all</title><script>
document.write("<style>.aq21{position:absolute;clip:rect(436px,auto,auto,436px);}</style>");
</script>
<div class=aq21>
<a href=http://samedaypaydayloansonlineelqmt.com >same day payday loans online</a>
<a href=http://samedaypaydayloan
This example was in the Users table, in the UserRights column. The initial value was 'all', but then you can see the links that were appended.
I need to write a regex script that will search through all fields in each column of each table in the database and remove this extra markup.
Essentially, if I can match </title>, then that string and everything that follows it can be replaced with an empty string.
All of these appended strings are the same for each field in the same column. However, there are multiple columns in each table.
This is what I have been doing so far, replacing the hacked part, but a nice regex would probably help me out, though my regex skills... well, suck.
UPDATE [databasename].[db].[databasetable]
set
UserRights = replace(UserRights,'</title><script>document.write("<style>.aq21{position:absolute;clip:rect(436px,auto,auto,436px);}</style>");</script><div class=aq21><a href=http://samedaypaydayloansonlineelqmt.com >same day payday loans online</a><a href=http://samedaypaydayloan','');
Any regex help and/or tips are appreciated.
This is what I ended up doing (big thanks to @Bohemian):
I went through each table and checked which column was affected. Then I ran the following script on each column:
UPDATE [tablename]
set columnname = substring(columnname, 1, charindex('<', columnname)-1)
where columnname like '%<%';
Where a column legitimately contained markup of its own, I updated those records by hand (luckily for me there were only a couple of records).
If anyone has any better solutions, please feel free to comment.
Thanks!
Since the bad stuff starts with a <, and that is an unusual character to find in typical data, I would use normal text functions, something like this:
update mytable set
mycol = substring(mycol, 1, charindex('<', mycol) - 1)
where mycol like '%<%';
And methodically do this with every column of every table.
Note that I'm only guessing at the right function to use, since I'm unfamiliar with SQL Server, but you get the idea.
I welcome someone editing the SQL to improve it.
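To apply this methodically across the whole database, one option is to generate the per-column UPDATE statements from INFORMATION_SCHEMA and review them before running anything. This is a sketch under the same assumptions as the answers above (the junk always starts at the first <, and legitimate data contains no <); legacy text/ntext columns are excluded because they don't support the same string functions. Back up the database first.
-- Generate one cleanup UPDATE per character column; inspect the output, then run it by hand.
SELECT 'UPDATE ' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
     + ' SET ' + QUOTENAME(COLUMN_NAME) + ' = SUBSTRING(' + QUOTENAME(COLUMN_NAME)
     + ', 1, CHARINDEX(''<'', ' + QUOTENAME(COLUMN_NAME) + ') - 1)'
     + ' WHERE ' + QUOTENAME(COLUMN_NAME) + ' LIKE ''%<%'';'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar');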

store one date and two time fields

I need to store schedule dates and times. A schedule contains one date field and two time fields.
Is there any possibility to store a schedule in one db field instead of two (datetime + datetime)?
I am using SQL Server 2005.
Thanks!
Whether it is "start" + "stop" or "start" + "duration", you have two pieces of information, so store two pieces of information.
Using a string or XML makes no sense: it takes more space, more processing, and more code to search and use.
Why would you want to store what are effectively two datetimes in one field rather than two? Are there no cases where the schedule might have times that cross days (e.g. 01/03/2011 23:59 to 02/03/2011 01:35)? Do you not mind having to parse out the information rather than having it immediately ready to query?
If you really want to, there's no reason you can't store it as a string type (comma separated, possibly, or maybe XML as suggested), but I can't say it's recommended: date/time fields are more space efficient, nice and fast/flexible for searching purposes, and there are many useful T-SQL functions that can easily be used on date/time types which you'd be hard pushed to use on a string without some parsing and casting/converting.
If you can come up with a good reason for not using two datetime fields, I'll have another donut! (P.S. happy Fat Thursday.)
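For reference, a minimal sketch of the two-column design those answers recommend (the table and column names here are hypothetical, and datetime is used since this is SQL Server 2005):
CREATE TABLE Schedule (
    ScheduleId int IDENTITY(1,1) PRIMARY KEY,
    StartsAt   datetime NOT NULL,
    EndsAt     datetime NOT NULL
);
-- Range queries stay simple and index-friendly:
SELECT ScheduleId FROM Schedule
WHERE StartsAt >= '20110301' AND StartsAt < '20110302';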
One quick, and horribly evil, thought... you could use part of the datetime to store the "difference": sneak it into the "seconds" and "milliseconds" values, and apply it to the main date/time to get the new value. A bit hacky, but it could do the job, depending on your range requirements.
-- Example: 01/03/2011 12:30:02
-- Translates to: first of March 2011, 12:30 to 14:30 (12:30 + seconds interpreted as hours)
set @ModifiedDatetime =
DATEADD(hour, DATEPART(second, @originalDateTime), @originalDateTime);
Beware of rounding errors with milliseconds ... and please think about the consequences of what you're doing. God kills a kitten each time someone abuses a type :)
You can try using the XML field type and store an XML snippet in there, similar to the following:
<schedule date="2011-01-01" fromTime="12:00" toTime="14:00" />
You can then use XQuery in a select to transform the result set back to a "normal" row-based result set. A sample query implementing XQuery, based on my example's XML schema, could be as follows:
SELECT
[...]
, Schedule.value('(/schedule/@date)[1]','datetime') as [Date]
, Schedule.value('(/schedule/@fromTime)[1]','char(5)') as [FromTime]
, Schedule.value('(/schedule/@toTime)[1]','char(5)') as [ToTime]
FROM [TABLE]
I'm not saying that storing it as XML is the best way to do it (as the other answers rightfully state), but you asked IF it is possible and I propose a solution...

SQL Server String Manipulation for URLs?

I need to append a parameter-value 'xval=9' to all non-blank SQL Server column values in a multi-million row table. The column contains URLs, and they have a random number of query-string parameters appended. So I may need to append '?xval=9' or '&xval=9', depending on whether parameters already exist.
So the URL values could look like any of these:
http://example.com/example
http://example.com/example/?aval=1
http://example.com/example/?aval=1&bval=2
http://example.com/example/index.html?aval=1&bval=2
'aval' and 'bval' are just samples, really any kind of key/value pair might be on the end of the URL.
What is the smartest pure T-SQL way to manipulate that, hopefully utilizing some kind of indexing?
Thanks.
Do that in the presentation or model layer, not the data layer.
That is, read all the data out and manipulate it using C# or whatever other language you use.
Maybe this should work
SELECT CASE CHARINDEX('?', Url)
         WHEN 0 THEN Url + '?foo=boo'
         ELSE Url + '&foo=boo'
       END AS Url
FROM Whatever
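As a sketch of the in-place version (Whatever and Url are the placeholder names from the query above, with the question's xval=9 substituted in), the same CASE can drive an UPDATE restricted to non-blank rows:
UPDATE Whatever
SET Url = Url + CASE WHEN CHARINDEX('?', Url) = 0 THEN '?xval=9' ELSE '&xval=9' END
WHERE Url IS NOT NULL AND Url <> '';
For a multi-million row table, consider running it in batches (e.g. UPDATE TOP (50000) ... WHERE Url NOT LIKE '%xval=9' in a loop, assuming no pre-existing xval=9 values) to keep locking and the transaction log manageable. No index will speed up the concatenation itself, but the WHERE clause can benefit from one on Url.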
