I want to create some custom search logic.
The logic is quite custom, so I don't see how it can be implemented by extending Solr.
More specifically, I want the client to use an item's id to search for similar items in the same category. But the returned results need to be filtered with some very custom logic.
For that reason, I think I want to implement a custom service that exposes a REST API to the client and forwards the request to Solr.
Do you think I can avoid this option by extending Solr's search implementation?
Which is the best practice?
The best practice is to have a layer between Solr and the client anyway. Solr does not have security out of the box, and anybody who can access it can issue delete commands as well as search queries.
So exposing a REST interface to the client and talking to Solr over a secured link (firewall/IP protected) is good practice.
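As a rough illustration of such a layer, the sketch below shows the two pieces the service would own: building the Solr request from nothing but an item id and category, and post-filtering the results with custom logic. The host name, the `items` core, and the `category` and `id` field names are assumptions made for the example, not details from the question.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode

# Hypothetical internal address; in this setup only the service can reach it.
SOLR_SELECT = "http://solr.internal:8983/solr/items/select"


def quote_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def similar_items_url(item_id: str, category: str, rows: int = 20) -> str:
    """Build the select URL: same category, excluding the item itself."""
    params = [
        ("q", "*:*"),
        ("fq", f"category:{quote_term(category)}"),  # assumed field name
        ("fq", f"-id:{quote_term(item_id)}"),        # assumed field name
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    return f"{SOLR_SELECT}?{urlencode(params)}"


def apply_custom_filter(docs: List[Dict], keep: Callable[[Dict], bool]) -> List[Dict]:
    """Post-filter Solr results with logic that lives only in the service."""
    return [doc for doc in docs if keep(doc)]
```

Because the client only ever sends an id to the REST endpoint, it has no way to submit raw Solr queries or update commands.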
Related
So currently in the project we have a collection of documents that don't require authentication to be read. They are write/update protected, but everyone can read.
What we are trying to prevent is someone looking at the Firebase endpoints and somehow managing to scrape the entire collection in JSON format (if this is even possible). The data is public, but I want it to be accessible only from our website.
One of the solutions we could think of was SSR (we are already using Next.js), but implementing SSR just for this reason doesn't seem very enticing.
Any suggestions would be appreciated.
EDIT:
Let me rephrase a little bit.
From what you see in the network tab, is it possible to forge/create a request to Firestore and get the entire collection instead of just the 1 document that was intended?
The best solution in your case is SSR. I know it may not sound enticing, but consider when SSR should be used at all. Your use case has an important requirement: security. I think that is already a strong enough reason to justify SSR.
Also, creating a dedicated service account for the Next.js app, and securing the data with custom rules that allow reads only from that service account, would further improve the overall security level.
Last: reading the data server side should make your site a little faster, even if the difference is hard to notice, because we are talking about milliseconds. As it is now, your page has to load before the request to Firebase can be sent, which adds a small delay. If the data is loaded server side, that delay disappears.
is it possible to forge/create a request to Firestore and get the entire collection instead of just the 1 document that was intended?
If you want to limit what people can request from a collection, you're looking for security rules. The most common model there is some form of ownership-based access control or role-based access control, but both of those require some way of identifying the user. This could be anonymously (so without them entering credentials), but it'd still be a form of auth.
If you don't want to do that, you can still control how much data can be fetched through the API in one go. For example, if your security rules allow get but not list, the user can only request a document once they know its ID. Even if you allow list, you can control in rules which queries are allowed.
I think one approach could be writing a Cloud Function that retrieves this public data using the admin SDK. Then, you could set a rule that nobody can read those documents. This means that only your Cloud Function with the admin SDK will have access to those documents.
Finally, you could set up App Check for that specific Cloud Function. This way, you ensure that requests come only from your client app.
https://firebase.google.com/docs/app-check
I'm trying to make a simple React app that uses Cloud Firestore for user auth and storing data; something that I could serve using Heroku or something like that.
I'm running into trouble with enabling a user to delete their account (and associated data), as Firestore tells me that it's a bad idea to delete collections from the client side. Here's what they say:
Deleting a collection requires coordinating an unbounded number of individual delete requests. If you need to delete entire collections, do so only from a trusted server environment. While it is possible to delete a collection from a mobile/web client, doing so has negative security and performance implications.
https://firebase.google.com/docs/firestore/manage-data/delete-data
While I might be able to delete the document connected with the user's account, this suggests that I can't really delete the sub-collections under that document.
So what would be a good way of automatically removing both the user document and its sub-collections? Can I achieve this through my React code? If not, is there a relatively easy way to do it without building a fancy back end?
Well, the documentation never says that this is not possible, just not recommended.
This makes sense: if you want to delete everything, including documents in subcollections, you have to write logic that deletes them one by one, which is a very read-intensive process. Doing all of this data processing in your web app is not good practice; it might slow your app down or even block functionality while the process is running.
What I would recommend is to follow Firestore's own recommendation, which is to create a Callable Function that performs these actions for you. All processing is done by that function, so it will not degrade your app's performance. You can find more details in the documentation.
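To make the "one by one" point concrete, here is a minimal sketch of the traversal such a function would have to perform. It uses a plain nested dict as a hypothetical stand-in for a document tree and makes no Firestore calls; the batch size is an arbitrary parameter, not a documented limit.

```python
from typing import Dict, Iterator, List

# Hypothetical in-memory stand-in for a document tree:
# each document maps subcollection name -> {doc_id -> document}.
Doc = Dict[str, Dict[str, "Doc"]]


def paths_to_delete(path: str, doc: Doc) -> Iterator[str]:
    """Yield every document path under `path`, children before parents."""
    for coll_name, docs in doc.items():
        for doc_id, child in docs.items():
            yield from paths_to_delete(f"{path}/{coll_name}/{doc_id}", child)
    yield path


def in_batches(paths: List[str], size: int) -> List[List[str]]:
    """Split the delete list into chunks so each request stays bounded."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]


user: Doc = {
    "orders": {"o1": {"items": {"i1": {}, "i2": {}}}, "o2": {}},
    "notes": {"n1": {}},
}
all_paths = list(paths_to_delete("users/alice", user))
batches = in_batches(all_paths, 4)
```

Every nested document has to be discovered and then deleted individually, so the amount of work grows with the size of the tree. That is the reason to run it in a trusted server environment rather than in the browser.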
I am feeding Azure Search with data from a multi-tenant database, so every document in the index has a property TenantId. For searching, aggregations and suggestions I always filter by "TenantId eq 'xxx'" depending on the user calling it.
However for autocomplete it is not possible to filter, so if it returns "something", the tenant in context might not have "something" in his data. Any way to overcome this?
This feature is actively being developed and will be completed before the Autocomplete API reaches General Availability. I'll update this thread once we deploy the change so you can try it.
Can someone please suggest how to implement a search feature in an application built with AngularJS, Node.js and MongoDB? When a user enters the letter a, all book names in the database starting with a should be displayed in a drop-down (e.g. the tags drop-down on Stack Overflow).
Any suggestion and help?
You can run a web service and have Angular make a REST API call to it whenever someone presses a key in the search box.
The web service code should handle querying the database and sending back the results.
You can use a cache to make it faster.
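A minimal sketch of the server-side part, written in Python for illustration: `prefix_filter` builds a MongoDB-style filter using the `$regex` and `$options` operators, and `suggest` caches results per prefix. The `name` field and the in-memory `BOOKS` tuple are assumptions standing in for the real collection.

```python
import re
from functools import lru_cache
from typing import Dict, Tuple

# Hypothetical sample data standing in for the books collection.
BOOKS = ("Algorithms", "Animal Farm", "Brave New World", "anathem")


def prefix_filter(prefix: str) -> Dict:
    """Filter document matching names that start with `prefix`, ignoring case."""
    return {"name": {"$regex": "^" + re.escape(prefix), "$options": "i"}}


@lru_cache(maxsize=1024)
def suggest(prefix: str, limit: int = 10) -> Tuple[str, ...]:
    """Return up to `limit` matching names; repeated prefixes hit the cache."""
    pattern = re.compile(prefix_filter(prefix)["name"]["$regex"], re.IGNORECASE)
    return tuple(name for name in BOOKS if pattern.match(name))[:limit]
```

In a real service the filter document would be passed to the database driver's find call instead of being matched in memory. Escaping the prefix keeps characters such as `+` or `.` from being interpreted as regex syntax.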
I need to add full-text search capabilities to my existing database. Of course, the first thing to turn to is something like Solr or Elasticsearch. The blocking point I've reached is how to securely display results returned from the underlying search engine (let's think about Solr or Elasticsearch for now, but any other solution or engine that solves the problem is also appreciated).
The tricky context is that my system has, for example, Personal Profile records that are to be indexed. One of the fields in a personal profile is the manager's feedback. Normally that field is visible only to the employee's direct manager and those higher in the hierarchy, i.e. a 'manager' from another branch will not be able to see it. However, I want that field to be searchable via full-text search, but only for people who can actually see it.
Now I query Solr for 'stupid' (that is the query string) and it returns N documents. When returning them to the end user I'll remove the 'Manager's feedback' field because the end user is not the manager of those people, but the mere presence of a document in the result set is already evidence of who the 'stupid' guys are.
The question is: what is a workable approach to handle that use case? Is it possible to plug a home-grown security filter for outputs into Solr/ES?
Caveats:
filtering out only fields does not work because of the scenario mentioned above
filtering out complete documents will not work because:
the search engine does not tell which fields matched, so there is no way to manually filter the result set by field http://elasticsearch-users.115913.n3.nabble.com/Best-way-to-return-which-field-matched-td2713071.html
even if this did work, removing documents from the result set would skew the facets (e.g. number of matches by department) returned by the engine. I would have to either recalculate the facets manually, or they would not match the manually filtered records and would reveal what I do not want to show to end users
In Solr you can create multiValued fields. In your case you can use one to store de-normalized values of the organization structure.
In the described scenario you would create a multiValued field ouId (Organization Unit Id) and store the employee's ouId and all parent ouIds in it. In other words, you save the allowed ouIds into this field.
When searching, you use a filter query (the fq parameter) to filter by the manager's ouId.
Example:
..&fq=ouId:12
where 12 is organization unit id of selected manager.
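A minimal sketch of both halves of this approach, assuming a hypothetical `PARENT_OF` mapping for the organization tree and a `feedback` field name that is not in the original answer. The indexing side stores the employee's ouId plus every ancestor, and the search side adds the manager's ouId as `fq`.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode

# Hypothetical organization tree: unit id -> parent unit id (None at the root).
PARENT_OF: Dict[int, Optional[int]] = {30: 12, 12: 1, 40: 2, 2: 1, 1: None}


def allowed_ou_ids(ou_id: int, parent_of: Dict[int, Optional[int]]) -> List[int]:
    """Return the unit itself followed by all of its ancestors."""
    ids: List[int] = []
    current: Optional[int] = ou_id
    while current is not None:
        ids.append(current)
        current = parent_of.get(current)
    return ids


def profile_doc(employee_id: str, ou_id: int, feedback: str) -> Dict:
    """Document to index: ouId is multiValued and holds every allowed unit."""
    return {
        "id": employee_id,
        "ouId": allowed_ou_ids(ou_id, PARENT_OF),
        "feedback": feedback,  # assumed field name
    }


def search_params(text: str, manager_ou_id: int) -> str:
    """Query string that restricts matches to the manager's unit via fq."""
    return urlencode([("q", text), ("fq", f"ouId:{manager_ou_id}")])
```

With this tree, an employee in unit 30 is indexed with ouId values 30, 12 and 1, so a manager of unit 12 matches the document while a manager of unit 2 does not. Because the restriction is part of the query, facet counts are computed over the permitted documents only.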
Maybe this is helpful for you: https://github.com/salyh/elasticsearch-security-plugin. It adds document-level security to Elasticsearch.
"Currently for user based authentication and authorization Kerberos/SPNEGO and NTLM are supported through 3rd party library waffle (only on windows servers). For UNIX servers Kerberos/SPNEGO is supported through tomcat build in SPNEGO Valve (Works with any Kerberos implementation. For authorization either Active Directory and generic LDAP is supported). PKI/SSL client certificate authentication is also supported (CLIENT-CERT method). SSL/TLS is also supported without client authentication.
You can use this plugin also without Kerberos/NTLM/PKI but then only host based authentication is available.
As of now two security modules are implemented:
Actionpathfilter: Restrict actions against Elasticsearch on a coarse-grained level like who is allowed to to READ, WRITE or even ADMIN rest api calls
Document level security (dls): Restrict actions on document level like who is allowed to query for which fields within a document"