I'm validating a standard "Phone" field and I'm new to regular expressions. I want an expression that raises a validation error if the input consists only of letters, i.e. text only.
Text is acceptable in combination with numbers and special characters, but text alone should not be accepted.
Please advise.
I used REGEX in a validation rule on the Phone field but couldn't get the desired result.
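One way to express the rule is to reject any value made up solely of letters. Below is a minimal sketch in Python, assuming that "text only" means ASCII letters and whitespace with no digits or special characters; the function name is illustrative, and the pattern may need adapting to the regex dialect of the validation engine in use.

```python
import re

# Assumption: "text only" means ASCII letters and whitespace, nothing else.
TEXT_ONLY = re.compile(r"[A-Za-z\s]+")

def is_valid_phone(value: str) -> bool:
    """Return False when the whole value is text only, True otherwise."""
    return TEXT_ONLY.fullmatch(value) is None

for sample in ["abcdef", "555-1234", "ext 42", "call me"]:
    print(sample, is_valid_phone(sample))
```

Matching the whole value (rather than searching for a letter anywhere) is what lets mixed input such as "ext 42" pass while "abcdef" is rejected.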
I have a field containing short texts (a few tokens). I index it as Text rather than String because I need to search within the text.
However, I also need String-style search (matching the entire field).
For example, if a field contains Google Search Engine, I can currently find the row by searching "search engine". While preserving this behavior, I need another option that matches the row only if the search term is "google search engine".
I believe this is possible with a regex, but it would be slow.
I wonder if there is a standard way to do so or if I need to add another field of the same content but with the String type.
Use multiple fields - the definition of the second field will differ based on whether you want the search to be case sensitive or not. If you're OK with having a case sensitive field (i.e. "Google" and "google" are different terms), then string is the correct choice.
If you want the field to be case insensitive, use a TextField with a KeywordTokenizer (which keeps the input as a single, large token) with a LowercaseFilter attached (which lowercases the content).
You can then search both fields by using qf - query fields - with the edismax/dismax query parsers and score them differently. If you only need explicit searching (you choose yourself whether you want to match the whole string or just words in it), using the field name in the regular way would work.
Use a copyField instruction to index the same content into both fields without changing your indexing pipeline. You'll need to reindex your core / collection for the new field to get any values.
And no, you can't do this with a regex, since the regex is applied against the tokens. You already have the tokens split up into smaller parts, so /foo bar/ doesn't have a foo bar token to match against, just foo and bar - neither of which matches the regex.
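To make the last point concrete, here is a toy model in Python of a regex being applied per token. It does not use Solr; the two tokenizer functions are simplified stand-ins (assumptions) for a tokenizing text field and for a KeywordTokenizer with lowercasing.

```python
import re

def standard_tokens(text: str) -> list[str]:
    # Toy stand-in for a tokenizing text field: split on non-alphanumerics, lowercase.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def keyword_tokens(text: str) -> list[str]:
    # Toy stand-in for KeywordTokenizer plus lowercasing: the whole value is one token.
    return [text.lower()]

def regex_hits(pattern: str, tokens: list[str]) -> bool:
    # The regex has to match one entire token; it never sees the original string.
    return any(re.fullmatch(pattern, t) is not None for t in tokens)

value = "Google Search Engine"
# No single token contains a space, so the phrase pattern cannot match.
print(regex_hits(r"search engine", standard_tokens(value)))
# With one large token, the whole-field pattern can match.
print(regex_hits(r"google search engine", keyword_tokens(value)))
```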
Hello, I'm trying to get the list of values of the imported field Movie_name.
However, the hyphen appears to have been escaped and is not shown in the schema browser.
This is the data I have already added.
As you can see, there are two records with Movie_name:
"Movie_name":"sci-Fi2"},
"Movie_name":"sci-Fi"}]
What I'm trying to do is simple analytics: get the list of all names in the field Movie_name, not the documents themselves.
So the question is: why are the hyphens escaped in the schema browser?
Why can't I get the exact, correct name?
It's not being escaped - what you're seeing in the schema browser are the actual terms stored in the index (what is usually referred to as "tokens"). If you want these to be preserved in their original form (i.e. as a single token) to be used for faceting or analytics, store them as the type string instead of as a text-based field (which has a tokenizer attached - and that tokenizer usually splits the string into multiple smaller tokens on natural split points, such as -).
In your example, sci-fi is turned into sci and fi. If you use a string type, or a KeywordTokenizer, the input is kept as it is, and the token is stored as sci-fi instead.
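The effect on faceting can be sketched with a toy model in Python. The tokenizer below is a simplified assumption (split on anything that is not a letter or digit, then lowercase), not Solr's actual analysis chain, and the documents only carry the Movie_name values from the question.

```python
import re
from collections import Counter

# Only the Movie_name values from the question; other fields are omitted.
docs = [{"Movie_name": "sci-Fi2"}, {"Movie_name": "sci-Fi"}]

def text_terms(value: str) -> list[str]:
    # Toy text field: split on non-alphanumerics (including "-") and lowercase.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", value) if t]

def string_terms(value: str) -> list[str]:
    # Toy string field: the value is kept verbatim as a single term.
    return [value]

text_facets = Counter(t for d in docs for t in text_terms(d["Movie_name"]))
string_facets = Counter(t for d in docs for t in string_terms(d["Movie_name"]))

print(dict(text_facets))    # terms such as "sci" and "fi", not the original names
print(dict(string_facets))  # the original names, one term per document
```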
I am working on Solr 4+.
I have several fields in my Solr schema with different Solr field types.
Does searching on a text field differ from searching on a string field?
I ask because I am trying to search on a string field (which is a copy field of a few facet fields), and it does not work as expected. The destination string field is both indexed and stored.
However, when I change the destination field to a text field (indexed only), it works fine.
Can you suggest why this happens? What exactly is the difference between text and string fields in Solr with respect to searches?
TextFields usually have a tokenizer and text analysis attached, meaning that the indexed content is broken into separate tokens where there is no need for an exact match - each word / token can be matched separately to decide if the whole document should be included in the response.
StrFields cannot have any tokenization or analysis / filters applied, and will only give results for exact matches. If you need a StrField with analysis or filters applied, you can implement this using a TextField and a KeywordTokenizer.
A general text field that has reasonable, generic cross-language defaults: it tokenizes with StandardTokenizer, removes stop words from case-insensitive "stopwords.txt" (empty by default), and down cases. At query time only, it also applies synonyms.
The StrField type is not analyzed, but indexed/stored verbatim.
How can I change this query to find only records with numeric value of telephoneNumber attribute?
(&(objectClass=user)(telephoneNumber=*)(MemberOf=CN=Users,OU=Groups,DC=domain,DC=local))
I have to be sure that this field contains only digits.
You can't do that with an LDAP filter.
You may, however, be able to constrain the attribute so that non-numeric values never get in there in the first place.
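A workaround not mentioned above is to keep the existing filter and discard non-numeric values on the client after the search returns. A minimal sketch in Python, assuming the results have already been converted into dictionaries; the entries and the function name are made up for illustration.

```python
import re

# ASCII digits only; str.isdigit() would also accept other Unicode digit characters.
DIGITS_ONLY = re.compile(r"[0-9]+")

def numeric_phone_entries(entries: list[dict]) -> list[dict]:
    """Keep entries whose telephoneNumber consists of digits only."""
    return [
        e for e in entries
        if DIGITS_ONLY.fullmatch(e.get("telephoneNumber", "")) is not None
    ]

# Hypothetical entries, standing in for whatever the LDAP search returned.
entries = [
    {"cn": "alice", "telephoneNumber": "5551234"},
    {"cn": "bob", "telephoneNumber": "555-1234"},
    {"cn": "carol", "telephoneNumber": "ext. 12"},
]
print([e["cn"] for e in numeric_phone_entries(entries)])
```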
In Apache Solr, why should we prefer a string field over a text field if both serve the purpose?
How do string and text affect parameters such as index size, index reads, and index creation?
The field types defined by default in the Solr schema are vastly different.
String stores a word/sentence as an exact string without performing tokenization etc. It is commonly useful for exact matches, e.g. for faceting.
Text typically performs tokenization, and secondary processing (such as lower-casing etc.). Useful for all scenarios when we want to match part of a sentence.
If the following sample, "This is a sample sentence", is indexed into both fields, we must search for exactly the text This is a sample sentence to get a hit from the string field, while it may suffice to search for sample (or even samples with stemming enabled) to get a hit from the text field.
Adding to Johan Sjöberg's good answer:
You can sort a String but not a Text.