I want to perform bulk operations based on taxonomy terms instead of node IDs. For example, I have a term called "XXXX" that is assigned to 10 pieces of content. In a management view I want to display only the term name, so that when a user selects that term and chooses publish/unpublish, those 10 pieces of content are published/unpublished.
Is it possible with Rules? Can anyone please guide me?
I have a TYPO3 8 LTS installation with custom TCA records that are listed with pagination, but the records don't have a detail page, so I can't configure the extension accordingly. My client requires that these records be indexed with Solr. For example, a record x may exist on the third page, and if we search for x in Solr, the result should lead to this extension's page with page argument 3. Please help me find a way to resolve this issue.
AFAIK that is not possible, and it wouldn't make sense. A record can be listed in several places and by several plugins, so one record is indexed once (canonically). The composition of paginated result listings is not stored in the database, so figuring out on which page a record would appear is hard: it depends, e.g., on the sorting, the entered search phrase, the number of results per pagination page, etc.
That also makes sense from a user's point of view: they search for a specific phrase and want to see the result, not another list of results.
You should implement a detail page for the records. As a workaround that gets closer to your requirements, you could also add another parameter to the listing (the id of the record) and filter the list by that id, so the listing is shown with only the relevant record(s).
The URL is just another field in the Solr document, so if you don't fill it, just don't output it in the template. You can, for example, check the link field in the result, or add a condition based on the record type and show no link.
I have data that should only be exposed to users with the right entitlement. The data has a meta field called "system"; there are hundreds of systems, and authorized users are pre-defined in a database. How can I design the system so that each authorized user is granted only the data he or she is entitled to? E.g., if Adam is on the Equity team, he can view search results from systems 1-10 and not the rest; if Amily is on the Audit team, she can view search results from all systems 1-200.
Thank you very much.
For small-ish number of systems (i.e. where the count doesn't exceed the maximum number of boolean terms in a query), adding a fq=system:(1 OR 2 OR 3 OR 4 OR 5 ... OR 10) is the easiest way to limit the result set - this assumes that the user is allowed to access all fields in the document. Your external system can provide a list of system ids that the user is allowed to access, and you then apply the fq for every request to Solr.
It's important that this is done on every request, since a user can otherwise get exposed to data they should not have access to through facets and other metainformation.
Bloomberg has a nice presentation about how they attacked this problem in one of their 2014 Lucene/Solr Revolution talks - Efficient Scalable Search in a Multi-Tenant Environment - where they went as far down as implementing different access rights for parts of the values in a field.
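The fq approach described above can be sketched as follows. This is a minimal sketch: the in-memory entitlement table and the helper names (`allowed_systems`, `solr_params`) are hypothetical stand-ins for the external system mentioned in the answer.

```python
# Sketch: build a per-user filter query from an external entitlement lookup.
# The "system" field name follows the question; everything else is assumed.

def allowed_systems(user: str) -> list[int]:
    # Stand-in for the real entitlement database lookup.
    entitlements = {
        "adam": list(range(1, 11)),    # Equity team: systems 1-10
        "amily": list(range(1, 201)),  # Audit team: systems 1-200
    }
    return entitlements.get(user, [])

def solr_params(user: str, query: str) -> dict:
    systems = allowed_systems(user)
    if not systems:
        # No entitlements: match nothing rather than everything.
        fq = "-*:*"
    else:
        fq = "system:(%s)" % " OR ".join(str(s) for s in systems)
    # The fq must be attached to *every* request, including facet requests.
    return {"q": query, "fq": fq}

print(solr_params("adam", "quarterly report")["fq"])
# → system:(1 OR 2 OR 3 OR 4 OR 5 OR 6 OR 7 OR 8 OR 9 OR 10)
```

Because the restriction lives in `fq` rather than `q`, it does not affect scoring and the filter is cached independently per entitlement set.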
I think the best solution for your case is to add a multivalued field, let's call it userAllow, to each document you create.
Whenever you query Solr, pass the user id in a filter query as below, and Solr returns only those documents that the user has rights to.
If Adam's user id is 1:
q=(Your Query)
fq=userAllow:1
The result set then contains only records that user id 1 has rights to.
This is another way to implement it.
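The document-side variant above can be sketched like this. The `userAllow` field name follows the answer; the in-memory list of dicts is a stand-in for the Solr index, and `search` plays the role of `fq=userAllow:<user_id>`.

```python
# Sketch of the userAllow approach: each document carries a multivalued field
# listing the user ids allowed to see it; every query filters on that field.

docs = [
    {"id": "doc1", "title": "Equity data", "userAllow": [1, 3]},
    {"id": "doc2", "title": "Audit data", "userAllow": [2]},
]

def search(user_id: int) -> list[str]:
    # Equivalent of q=*:* with fq=userAllow:<user_id> in Solr.
    return [d["id"] for d in docs if user_id in d["userAllow"]]

print(search(1))  # → ['doc1']
```

The trade-off versus the fq-per-request approach: entitlements are baked into the documents, so changing a user's rights means reindexing the affected documents.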
I have a community website with 20,000 members who use a search form every day to find other members. The results are sorted by the most recently connected members. I'd like to use Solr for my search (right now it's MySQL), but first I'd like to know whether it's good practice to update the document of every member who logs in, in order to change their login date and time. There will be around 20,000 document updates a day; I don't really know if that's too much updating and could hurt performance. Thank you for your help.
20k updates/day is not unreasonable at all for Solr.
OTOH, for very frequently updated fields (imagine one user logging in multiple times a day, with each login recorded), you can use an ExternalFileField to keep that field stored outside the index (in a text file) and still use it for sorting in Solr.
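A minimal sketch of maintaining the text file behind such an ExternalFileField: the file lives in the core's data directory and is named `external_<fieldname>`, with one `key=value` line per document. The field name (`last_login`) and path here are assumptions.

```python
# Sketch: regenerating the external file for a Solr ExternalFileField.
# Each line maps a document's unique key to a float value; Solr rereads the
# file on commit/reload, so updating it never touches the index itself.

import time

def write_last_login_file(path: str, logins: dict[str, float]) -> None:
    with open(path, "w") as f:
        for doc_id, ts in sorted(logins.items()):
            f.write(f"{doc_id}={ts}\n")

logins = {"user42": time.time(), "user7": time.time() - 3600}
write_last_login_file("external_last_login", logins)
```

Queries can then sort on the value through a function query, e.g. `sort=field(last_login) desc`, without reindexing any member document.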
Generally, Solr is not meant to be used for this purpose; using your database is still better.
However, if you want to use Solr, you can treat it much like a database: every user document should have a unique field, e.g. id. When the user logs in, you can run an update query against that user's document's last_login_date field by its id. You can learn more about Partial Update from this link.
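A partial (atomic) update along those lines could look like the following sketch. The payload shape (`{"set": value}` keyed by the unique `id`) is standard Solr atomic-update JSON; the field name `last_login_date` and the core name in the comment are assumptions.

```python
# Sketch of a Solr atomic ("partial") update payload for a single field.
# Only the keyed field changes; other fields are preserved, provided the
# schema stores them (or they have docValues).

import json

def atomic_update(doc_id: str, last_login: str) -> str:
    payload = [{"id": doc_id, "last_login_date": {"set": last_login}}]
    return json.dumps(payload)

body = atomic_update("user42", "2016-05-01T12:00:00Z")
# POST this body to e.g. http://localhost:8983/solr/members/update?commit=true
print(body)
```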
I have implemented a search engine in Java. It has a database that stores the inverted index, i.e., the mapping from terms to the list of documents each term appears in. There is a feature that allows a user to upload a document to be added for indexing. The problem I'm facing is that every time a new document is added, the index is reconstructed in memory instead of being updated. To update it, I would need a database that stores document vectors, which are essentially the tf-idf (term frequency * inverse document frequency) weights of every term in the index. I'm unable to work out the database structure for it, i.e., what rows, columns, or multiple tables would be needed to store such a structure.
I need to store
1. Document ID
2. Document Title
3. N dimensional Document vector where N is the number of unique terms
4. N terms
5. IDF of each term
6. TF of each term for every document.
I need this so that at query-matching time I can extract a document's vector and calculate its similarity with the query vector. If you want any additional information, please let me know.
Thank you very much; I'm sure I'll get some help here.
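The fields listed in the question can be sketched in code. This is a minimal, made-up example: the tiny corpus is invented, and the dicts stand in for tables (a relational layout could mirror them as a `terms` table holding `(term, idf)` and a `postings` table holding `(doc_id, term, tf)`).

```python
# Sketch: tf-idf document vectors and cosine similarity against a query,
# covering the fields listed above (doc id, terms, tf, idf, vector).

import math
from collections import Counter

docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "the quick dog barks",
}

tf = {d: Counter(text.split()) for d, text in docs.items()}   # tf per doc
df = Counter(t for counts in tf.values() for t in counts)     # doc freq
idf = {t: math.log(len(docs) / df[t]) for t in df}            # idf per term

def vector(counts: Counter) -> dict[str, float]:
    return {t: c * idf.get(t, 0.0) for t, c in counts.items()}

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query: str) -> list[int]:
    qv = vector(Counter(query.split()))
    scores = {d: cosine(qv, vector(c)) for d, c in tf.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(rank("quick fox"))  # → [1, 3, 2]
```

With this layout, adding a document only inserts new postings rows and updates the affected idf values, rather than rebuilding the whole index in memory.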
Are you sure you want to use a database to implement a search engine?
You may take a look at this Java framework, which does an excellent job and is very simple to learn:
Lucene Tutorial in 5 mins
It uses the Vector Space Model and there's no need for you to worry about all the above fields you mentioned in your post, since Lucene stores them along with much more advanced ranking factors.
I'm sorry my reply doesn't help if you are intentionally using a database.
I have a location autocomplete field that offers completions for all countries, cities, neighborhoods, villages, and zip codes. This is part of a location tracking feature I am building for my website. So you can imagine this list will run to multiple millions of rows; I'm expecting over 20 million at least, with all the villages and postal codes. To make the autocomplete perform well, I will use memcached so we don't always hit the database for this list. It will be used a lot, as this is the primary feature of the site. But the question is:
Is only one instance of the list stored in memcached regardless of how many users pull the info, or does it need to maintain a separate instance for each user? So if, say, 20 million people are using it at the same time, will that differ from just one person using the location autocomplete? I am also open to other ideas on how to implement this location autocomplete so it performs well.
Or can I do something like this: when a user logs in, I send them the list in the background anyway, so that by the time they reach the autocomplete text field their computer already has it and can load suggestions instantly?
Take a look at Solr (or Lucene itself), using NGram (or EdgeNGram) tokenizers you can get good autocomplete performance on massive datasets.
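The EdgeNGram idea can be illustrated with a small sketch: every leading prefix of each name is indexed once at index time, so a lookup is a single dictionary probe rather than a scan. The place names here are made up, and one shared index serves every user, which is also how a single memcached copy of such a structure would behave.

```python
# Sketch of edge-n-gram autocomplete: index all leading prefixes up front.

from collections import defaultdict

places = ["Berlin", "Bergen", "Bern", "Paris", "Palermo"]

# Index time: map each edge n-gram (prefix) to the matching names.
prefix_index: dict[str, list[str]] = defaultdict(list)
for name in places:
    lower = name.lower()
    for i in range(1, len(lower) + 1):
        prefix_index[lower[:i]].append(name)

def autocomplete(prefix: str, limit: int = 10) -> list[str]:
    # Query time: one hash lookup, no scanning of the full list.
    return prefix_index.get(prefix.lower(), [])[:limit]

print(autocomplete("ber"))  # → ['Berlin', 'Bergen', 'Bern']
```

The cost is index size (one entry per prefix), which is the trade Solr's EdgeNGram tokenizer makes too: pay memory and index time to keep per-keystroke lookups cheap at 20-million-row scale.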