Is this possible? I'd like to normalize some data I'm storing in CouchDB: take in one JSON object and, through design docs, create multiple documents with different pieces of the data.
An example would be posting data about a book. I'd like to create a document for the book, check and see if there's any information about a publisher and if we have a document for that publisher, and if we don't, create a document for the publisher as well.
Does CouchDB have any functionality that would accomplish this? I know I could split up the data on the client, but I'd rather this logic be more centralized.
You can post multiple docs at once with the _bulk_docs endpoint, but it doesn't support any logic like you're describing. That must be done on the client.
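For the posting half, the _bulk_docs request body is just a JSON object with a docs array. A minimal Python sketch (the database name, document fields, and ids here are invented for illustration; the existence check for the publisher still has to happen on the client before you build the payload):

```python
import json

# _bulk_docs accepts {"docs": [...]} -- one POST creates every document in the list.
book = {"type": "book", "title": "Example Book", "publisher_id": "acme-press"}
publisher = {"type": "publisher", "_id": "acme-press", "name": "Acme Press"}

payload = json.dumps({"docs": [book, publisher]})

# The actual request would be something like:
#   POST http://localhost:5984/library/_bulk_docs
#   Content-Type: application/json
# sent with urllib.request or any HTTP client.
print(payload)
```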
I am looking at ways to store user data alongside transactional data like orders and invoices. Normally I would use a relational database like PostgreSQL, but I wanted to know if it would be a good idea to store the user data along with their transactional data in one NoSQL table like DynamoDB.
I would assume that if you did that, you would structure your data using objects or arrays to store the orders or invoices, but I'm not sure if that is the best way to go about it.
EDIT
So after doing some more research and trying to understand how to fit everything into a single-table design, I found this article in the AWS documentation. I decided to organise my data into collections using a combination of the partition key and the sort key. The sort key is used to determine collections (e.g., orders, customer-data, etc.). This solution is perfect for my use case because I can keep all the user data (including transactions like orders) in one DynamoDB table.
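As an illustration of that layout, here is a pure-Python sketch. The key names PK/SK and the prefixes are conventions I've assumed, not anything DynamoDB mandates, and the query helper just emulates what a real Query with begins_with on the sort key would return:

```python
# One table; the partition key groups a user's items, and the sort-key
# prefix distinguishes the collection each item belongs to.
items = [
    {"PK": "USER#123", "SK": "PROFILE", "name": "Alice"},
    {"PK": "USER#123", "SK": "ORDER#2023-01-05", "total": 40},
    {"PK": "USER#123", "SK": "ORDER#2023-02-11", "total": 25},
    {"PK": "USER#123", "SK": "INVOICE#0007", "amount": 65},
]

def query_collection(pk, sk_prefix):
    """Emulates a DynamoDB Query with
    KeyConditionExpression: PK = :pk AND begins_with(SK, :prefix)."""
    return [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]

orders = query_collection("USER#123", "ORDER#")  # just this user's orders
```

One Query per access pattern, no scans: that is the property single-table design is built around.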
In short, don't do that. DynamoDB is a great tool, but you need to understand it first. It's not just a NoSQL database, it's also a distributed one. It gives great performance, scalability and pricing, but modeling is trickier. You cannot build queries as you please; they have to be taken into consideration when you design your model. Read about queries vs. scans and global vs. local indexes. When you get that, you might try reading about single-table design. It should give you an idea of the limitations of DynamoDB.
I'm working on an application that uses a relational database (MySQL) for most of its entities. It is built in a microservices architecture, and each service uses a separate MySQL database.
Now I'm trying to implement a search engine for publications, using Elasticsearch as a middleware funnel, to be able to search across all entities that are related to the publication object, even ones from different services with different databases.
What will be the best way to index the publication object?
I have 3 options in mind:
Create a full publication type with multi-nested object types.
problems:
Duplication of all the entities from different services
Hard to update for example in case of updating an instruction
Create a different publication type with all the fields from the other objects, then normalize the data when inserting or finding it.
problems:
normalizing data on inserting and on finding is costly
hard to maintain and to update
Insert multiple separate types, similar to the relational database, then run multiple queries to find the final object. For example, to find a publication by user_name we have to find the user first, then use the user_id to find the publications.
problems:
we have to make more than 1 query to get valid results
Use a has_parent/has_child relation, but in this case the child is the publication, and it has multiple many-to-one relations, so it would need multiple parents.
I could be going in the wrong direction; please share your feedback if you think I should use a different technology.
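For reference, option 1 above would correspond to a mapping along these lines. This is only a sketch: the field names are invented, and the dict is simply what you would pass to the index-creation call of an Elasticsearch client:

```python
# Publication index with related entities embedded as nested objects.
# "nested" keeps each author/instruction as its own sub-document, so a
# query can match several fields within the same nested object.
publication_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "authors": {
                "type": "nested",
                "properties": {
                    "user_id": {"type": "keyword"},
                    "user_name": {"type": "text"},
                },
            },
            "instructions": {
                "type": "nested",
                "properties": {
                    "instruction_id": {"type": "keyword"},
                    "body": {"type": "text"},
                },
            },
        }
    }
}
```

The trade-off is exactly the one listed under option 1: updating a shared entity, such as an instruction, means reindexing every publication that embeds it.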
Not sure if I understand right, but if you just want people to search the publications entity, then creating a view with the fields needed would be my first solution.
Can you elaborate on why exactly you thought of Elasticsearch?
Second:
I don't think you should worry about duplication when it comes to Elasticsearch.
We have a Cloudant database on Bluemix that contains a large number of documents that are answer units built by the Document Conversion service. These answer units are used to populate a Solr Retrieve and Rank collection for our application. The Cloudant database serves as our system of record for the answer units.
For reasons that are unimportant, our Cloudant database is no longer valid. What we need is a way to download everything from the Solr collection and re-create the Cloudant database. Can anyone tell me a way to do that?
I'm not aware of any automated way to do this.
You'll need to fetch all your documents from Solr (and assuming you have a lot of them, do this in a paginated way - there are some examples of how to do this in the Solr doc) and add them into Cloudant.
Note that you'll only be able to do this for the fields that you have set to be stored in your schema. If there are important fields that you need in Cloudant that you haven't got stored in Solr, then you might be stuck. :(
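A sketch of the plumbing in Python (the HTTP calls themselves are left as comments, and the field names are assumptions about your schema). Solr's cursorMark pagination avoids deep-paging problems on large result sets, and Cloudant accepts batches via _bulk_docs:

```python
def solr_page_params(cursor_mark, rows=100):
    # Query parameters for one page of a cursorMark export. The first
    # call uses cursor_mark="*"; later calls pass the nextCursorMark
    # from the previous response. Solr requires a sort on the uniqueKey
    # field (assumed here to be "id").
    return {"q": "*:*", "rows": rows, "sort": "id asc", "cursorMark": cursor_mark}

def to_bulk_docs(solr_docs):
    # Strip Solr-internal fields and wrap the batch in the shape
    # Cloudant's POST /<db>/_bulk_docs expects: {"docs": [...]}.
    cleaned = [{k: v for k, v in d.items() if k != "_version_"} for d in solr_docs]
    return {"docs": cleaned}

# Loop sketch:
#   GET  <solr>/select with solr_page_params(cursor)     -> docs, nextCursorMark
#   POST <cloudant>/<db>/_bulk_docs with to_bulk_docs(docs)
#   stop when nextCursorMark equals the cursor you just sent
```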
You can replicate one Cloudant database to another, which will give you an exact replica.
Another technique is to use a tool such as couchbackup which takes a copy of your database's documents (ignoring any deletions) and allows you to save the data in a text file. You can then use the couchrestore tool to upload the data file to a new database.
See this blog for more details.
I provide a form for users to upload their own data. I use ajax-form-submit and then parse the data to create numerous models (one per row in uploaded csv).
Now, I want to create models into a predefined collection.
I can use add, which takes an array of models, but unfortunately it does not send a POST to the server. I know I could iterate and call .create for each model, but say I have 10k models: that would make 10k calls. Sounds unreasonable. Did I miss anything?
The other way is to accept multiple models at the server via a single .ajax call and then add them to the collection manually for UI rendering.
Looking for the best route. Thanks.
Backbone and REST simply do not cover all real-world use cases, such as your bulk create example. Nor do they have an official pattern for bulk delete, which is also extremely common. I am baffled as to why they refuse to address these extremely common use cases, but in any case, you're left to your own good judgement here. So I would suggest adding a bulkSave or import method to your collection. That should send an AJAX POST request with your CSV form data to the server; the server should save the info and, if all goes well, return a JSON array of the newly-created models. Your collection should take that JSON array in the POST response and pass it to reset (and parse as well if you need special parsing).
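The server-side half of that bulkSave could be as simple as the following framework-agnostic Python sketch. The CSV columns and the id assignment are invented; the returned JSON array is what the collection would pass to reset:

```python
import csv
import io
import json

def bulk_import(csv_text, next_id=1):
    # Parse the uploaded CSV, "save" each row, and return the array of
    # created models (now carrying server-assigned ids) as the response body.
    created = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        row["id"] = next_id + i  # stand-in for a real persistence layer
        created.append(row)
    return json.dumps(created)

response = bulk_import("name,price\nwidget,9\ngadget,12\n")
```

One request regardless of row count, and the client still ends up with fully-populated models.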
Definitely don't do a POST request for each model (row in your CSV), especially if you plan on having 10K models. However, to be clear, it wouldn't be completely terrible to do that pattern for a few dozen models if your UI shows real-time progress and error handling on a per-record basis (23 of 65 saved, for example).
I like the pragmatic approach of @PeterLyons, but another idea could be to transform your non-REST functionality into REST functionality.
What you want is to create a bunch of models at once. REST doesn't allow creating multiple resources at once; what REST likes is creating one resource at a time.
No problem: we create a new resource called Bulk, with its own URL and its own POST verb. The attributes of this model are the array of models you want to create.
With this approach you can also solve future functionality like modifying and removing multiple models at once.
Now you just need to figure out how to associate the array of models with this new model and how to make the Bulk.toJSON method respond properly.
I'm building a web application that will essentially allow 'admins' to create forms with any number and combination of form elements (checkboxes, combo-boxes, text-fields, date-fields, radio-groups, etc). 'Users' will log into this application and complete the forms that admins create.
We're using MySQL for our database. Has anyone implemented an application with this type of functionality? My first thought is to serialize the form schema as JSON and store it as a field in one of my database tables, and then serialize the submissions from different users and store those in a MySQL table as well. Another thought: is this something that a NoSQL database such as MongoDB would be suited for?
Yes, a document-oriented database such as MongoDB, CouchDB, or Solr could do this. Each instance of a form or a form response can have a potentially different set of fields. You can also index fields, which will help you query for documents that contain a given field.
Another solution I've seen for implementing this in an SQL database is the one described in How FriendFeed uses MySQL to store schema-less data.
It's basically like your idea for storing semi-structured data as serialized JSON, but you also create another table for each distinct form field you want to index. Then you can do fast indexed lookups and join back to the row whose serialized JSON includes the field/value pair you're looking for.
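Here is a miniature version of that pattern using SQLite (the table and column names are mine, not FriendFeed's): the entity body is opaque JSON, and a side table indexes one field pulled out of it.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT)")
db.execute("CREATE TABLE index_user (user_id TEXT, entity_id TEXT)")
db.execute("CREATE INDEX ix_user ON index_user (user_id)")

def save(entity_id, doc):
    # Store the full document as JSON, and index only the field we query by.
    db.execute("INSERT INTO entities VALUES (?, ?)", (entity_id, json.dumps(doc)))
    db.execute("INSERT INTO index_user VALUES (?, ?)", (doc["user_id"], entity_id))

save("s1", {"user_id": "u1", "answers": {"color": "blue"}})
save("s2", {"user_id": "u2", "answers": {"color": "red"}})

# Indexed lookup on the side table, then join back to the opaque JSON row.
row = db.execute(
    "SELECT e.body FROM index_user i JOIN entities e ON e.id = i.entity_id "
    "WHERE i.user_id = ?", ("u2",)
).fetchone()
doc = json.loads(row[0])
```

Adding an index on a new field later is just a matter of creating another side table and backfilling it from the JSON bodies; the schema of the entities table never changes.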