Auto Deletion (TTL) in Cloudant

I have been looking up ways to set up expiration / auto-deletion of a document in Cloudant. According to the Cloudant documentation, TTL is not available. Am I mistaken in this regard? If not, what are the best alternatives for auto-deleting a record in Cloudant, or should I look into some other NoSQL alternative?

You are not mistaken: Cloudant doesn't have a TTL feature, but you could implement your own using a view and an external service that checks it.
So if you only needed to be sure a document was deleted on the day it expired, you could have a service that runs once a day, reads the view for documents expiring that day, and issues delete requests as required.
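For example, here is a minimal sketch of that pattern in Python, assuming each document carries an "expires" field holding an ISO-8601 date string (the account, database, and credentials below are placeholders):

# DIY TTL for Cloudant: a view keyed on an assumed "expires" field,
# plus a cleanup script you would run once a day (e.g. from cron).
# The map function stored in the design doc would be:
#   function (doc) { if (doc.expires) { emit(doc.expires, doc._rev); } }
import datetime
import requests

DB = "https://ACCOUNT.cloudant.com/mydb"   # placeholder account/database
AUTH = ("API_KEY", "API_PASSWORD")         # placeholder credentials

def delete_expired():
    today = datetime.date.today().isoformat()
    resp = requests.get(
        DB + "/_design/ttl/_view/by_expiry",
        params={"endkey": '"%s"' % today},  # ISO dates sort lexicographically
        auth=AUTH,
    )
    resp.raise_for_status()
    for row in resp.json()["rows"]:
        # row["id"] is the document _id; the view emitted its _rev as the value
        requests.delete(DB + "/" + row["id"],
                        params={"rev": row["value"]}, auth=AUTH)

if __name__ == "__main__":
    delete_expired()

Note that the _rev emitted by the view can be stale if a document was updated after indexing; a production version would re-read each document before deleting it.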

Related

Cloudant CDTDatastore to pull only part of the database

We're using Cloudant as the remote database for our app. The database contains documents for each user of the app. When the app launches, we need to query the database for all the documents belonging to a user. What we found is that the CDTDatastore API only allows pulling the entire database and storing it inside the app, then performing the query on the local copy. The initial pull to the local datastore takes about 10 seconds, and I imagine it will take longer as more users are added.
Is there a way I can save only part of the remote database to the local datastore? Or, are we using the wrong service for our app?
You can use a server-side replication filter function; you'll need to add information about your filter to the pull replicator. However, replication will take a performance hit when a filter function is used.
That being said, a common pattern is to use one database per user; however, this has other trade-offs and is something you should read up on. There is some information on the one-database-per-user pattern here.
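As a rough sketch of the filter approach (the database URL, credentials, and the "user_id" field are assumptions), the server-side filter is a JavaScript function stored in a design document, which you could publish like this:

# Publish a server-side replication filter that only passes one user's
# documents, assuming each document carries a "user_id" field.
import requests

DB = "https://ACCOUNT.cloudant.com/appdb"  # placeholder account/database
AUTH = ("API_KEY", "API_PASSWORD")         # placeholder credentials

design_doc = {
    "_id": "_design/app",
    "filters": {
        # CouchDB filter functions are JavaScript, stored as strings
        "by_user": ("function (doc, req) {"
                    "  return doc.user_id === req.query.user_id;"
                    "}"),
    },
}
requests.put(DB + "/_design/app", json=design_doc, auth=AUTH).raise_for_status()

The pull replicator on the device would then reference the filter as "app/by_user" and pass the user's ID as a filter parameter; check the CDTDatastore replication documentation for the exact property names.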

How do I keep a database of repo stats and user stats, in sync with github?

I know that I can use
curl https://api.github.com/repos/reggi/handwritten
and
curl https://api.github.com/users/reggi
to get repo and user data.
But what's the best way to keep that in sync with my database?
I don't believe there's a webhook for this general data, like when a stargazer gets added.
Should I just have a cron script that updates the database daily?
I don't believe there's a webhook for this general data, like when a stargazer gets added.
There is one event related to stargazing, and strangely enough, it is the Watch Event.
See "Upcoming Changes to Watcher and Star APIs".
More generally, using webhooks is a good way to be kept in sync with many kinds of events regarding a repo or a user account.
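If you do fall back to polling, here is a hedged sketch of the daily cron job; the two endpoints are the ones from the question, while the local table schema is made up for illustration:

# Poll the GitHub API once a day and upsert the counters into a local table.
import sqlite3
import requests

def sync():
    repo = requests.get("https://api.github.com/repos/reggi/handwritten").json()
    user = requests.get("https://api.github.com/users/reggi").json()
    db = sqlite3.connect("stats.db")
    db.execute("CREATE TABLE IF NOT EXISTS stats "
               "(name TEXT PRIMARY KEY, stars INT, forks INT, followers INT)")
    db.execute("INSERT OR REPLACE INTO stats VALUES (?, ?, ?, ?)",
               (repo["full_name"], repo["stargazers_count"],
                repo["forks_count"], user["followers"]))
    db.commit()

if __name__ == "__main__":
    sync()  # e.g. crontab: 0 6 * * * /usr/bin/python sync_stats.py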

How can I delete old backup via cron?

I'm using GAE/J and taking backups via cron.
https://developers.google.com/appengine/articles/scheduled_backups
Taking a backup is possible via the API (/_ah/datastore_admin/backup.create), but I could not find a way to delete a backup via the API.
I already tried "backup.delete", but it did not work.
Does anyone know a way to delete old backups via cron?
Thank you.
If you go to the Datastore Admin page, select some backups, and hit Delete, you will see that the page it takes you to has a form with hidden fields. That form POSTs the backup IDs (as separate fields, each named "backup_id") to this URL: /_ah/datastore_admin/backup_delete.do
The backup_id's are the keys of _AE_Backup_Information entities in the datastore.
I have not tried using it yet. The form uses an XSRF token.
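An untested sketch of driving that form from a script; the xsrf_token field name and how to obtain the token are assumptions, and the request would also need to be authenticated as an app admin:

# POST backup IDs to the hidden delete form described above. Untested.
import requests

APP = "https://myapp.appspot.com"        # placeholder app URL
ids = ["BACKUP_KEY_1", "BACKUP_KEY_2"]   # keys of _AE_Backup_Information entities

form = [("backup_id", i) for i in ids]   # repeated fields, all named backup_id
form.append(("xsrf_token", "TOKEN_SCRAPED_FROM_THE_FORM"))  # field name assumed
resp = requests.post(APP + "/_ah/datastore_admin/backup_delete.do", data=form)
print(resp.status_code)  # admin authentication (cookies) omitted here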

Does ID always increase when adding new objects to GAE Datastore?

I'm building a client/server app where I want to sync data. I'm thinking about including the largest key from the local client database in the query so the server can fetch all entities added after that entity (with key > largest_local_key).
Can I be sure that the Google App Engine always increase the ID of new entities?
Is that a good way to implement synchronization?
No, IDs do not increase monotonically.
Consider synchronizing based on a create/update timestamp.
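For example, a sketch with the Python ndb client (the model and property names are illustrative, not part of your schema):

# Timestamp-based sync: an auto-updated "updated" property plus a query
# that returns everything touched after the client's last checkpoint.
from google.appengine.ext import ndb

class Item(ndb.Model):
    payload = ndb.JsonProperty()
    updated = ndb.DateTimeProperty(auto_now=True)  # refreshed on every put()

def changes_since(last_sync):
    # Oldest first, so the client can advance its checkpoint as it consumes
    return Item.query(Item.updated > last_sync).order(Item.updated).fetch()

Beware of clock skew between server instances; overlapping the sync window slightly and de-duplicating on the client is a common precaution.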

Pulling facebook and twitter status updates into a SQL database via Coldfusion Page

I'd like to set up a ColdFusion page that will pull the status updates from my own Facebook and Twitter accounts and put them in a SQL database along with their timestamps. Whenever I run this page, it should only grab information newer than the most recent timestamp already in the database.
I'm hoping this won't be too bad because all I'm interested in is just status updates and their time stamps. Eventually I'd like to pull other things like images and such, but for a first test just status updates is fine. Does anyone have sample code and/or pointers that could assist me in this endeavor?
I'd prefer information that relates to the current versions of the APIs (Twitter with OAuth and Facebook Open Graph), if they are necessary. Some solutions I've seen involve creating a Twitter application and a Facebook application to interact with the APIs; is that necessary if all I want to do is access a subset of my own account information? Thanks in advance!
I would read the max(insertDate) from the database and, if the API allows it, only request updates since that date. Then insert those updates. The next time you run, you'll just need to get the max() of the last batch of updates before calling for the next batch.
You could run it every 5 minutes using a ColdFusion scheduled task.
Communicating with the API is usually done via <cfhttp />. One thing I always do is log every request and response, either in a text file or in a database. That can be invaluable when troubleshooting.
Hope that helps.
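Here is the same read-max-then-fetch loop sketched in Python, since the pattern is language-neutral; the endpoint and table are placeholders, and the real Twitter/Facebook calls would need OAuth:

# Incremental pull: read the newest stored timestamp, request only newer
# items from the API, and insert them.
import sqlite3
import requests

db = sqlite3.connect("statuses.db")
db.execute("CREATE TABLE IF NOT EXISTS statuses (posted_at TEXT, body TEXT)")

row = db.execute("SELECT MAX(posted_at) FROM statuses").fetchone()
since = row[0] or "1970-01-01T00:00:00Z"   # first run pulls everything

# Placeholder endpoint standing in for the authenticated Twitter/Facebook calls
updates = requests.get("https://example.com/api/statuses",
                       params={"since": since}).json()
db.executemany("INSERT INTO statuses VALUES (?, ?)",
               [(u["posted_at"], u["text"]) for u in updates])
db.commit()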
Use the cffeed tag to pull RSS feeds from Twitter and Facebook. Retain the date of the last feed scan somewhere (an application variable or the database) and loop over the feed entries. Any entry older than the last scan is ignored; everything else gets committed. Make sure to wrap cffeed in a try/catch, as it will throw errors if the service is down (ahem, Twitter). As mentioned in other answers, set it up as a scheduled task.
<cffeed action="read" properties="feedMetadata" query="feedQuery"
source="http://search.twitter.com/search.atom?q=+from:mytwitteraccount" />
This is a different approach than what you're suggesting, but it worked for us. We had two live events where we asked people to post to a bespoke Facebook fan page, or to Twitter with a hashtag we endorsed for the event, in real time. Then we fetched and parsed the RSS feeds of the FB page and the Twitter search results, extracting what was new, on a short interval... I think it was approximately every three minutes. CFFEED was a little error-prone and wonky; just doing a CFHTTP get of the RSS feeds and then processing the CFHTTP.filecontent struct item as XML worked fine.
