How to search in array with LIKE operator

 id |   name   |          ipAddress
----+----------+-----------------------------
  1 | testname | {192.168.1.60,192.168.1.65}
I want to search ipAddress with LIKE. I tried:
{'$mac_ip_addresses.ip_address$': { [OP.contains]: [searchItem]}},
This one also:
{'$mac_ip_addresses.ip_address$': { [OP.Like] : { [OP.any]: [searchItem]}}},
The data type of ipAddress is text[]. I want to search in ipAddress with LIKE.
searchItem contains the IP that needs to be searched in the ipAddress field, so I want to search the array with LIKE.

I don't know Sequelize, but I can answer from the Postgres side.
There is no short syntax to search for a pattern inside an array in PostgreSQL.
If you want to check the pattern against each array element individually, you need to unfold the array using unnest:
SELECT id, name, ipaddress
FROM testing
WHERE EXISTS (
SELECT 1 FROM unnest(ipaddress) AS ip
WHERE ip LIKE '8.8.8.%'
);
If the array is frequently searched this way, it's better to store the data in normalized form.
However, there is a short syntax (plus GIN index support) for equality-based search (see @> and other operators here).
SELECT id, name, ipaddress
FROM testing
WHERE ipaddress @> ARRAY['8.8.8.8'];
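The GIN index mentioned above could look like this (a sketch; the index name is mine, and the default array operator class covers @>, && and <@):
CREATE INDEX testing_ipaddress_gin_idx ON testing USING gin (ipaddress);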

What you asked
~~ is the operator Postgres uses internally to implement SQL LIKE. There is no commutator for it - no operator that works with the left and right operands switched.
That's the one you'd need for your attempt to use the ANY construct with the pattern to the left.
You can create the operator, though, and it's pretty simple:
CREATE OR REPLACE FUNCTION reverse_like (text, text)
RETURNS boolean LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT $2 LIKE $1';
CREATE OPERATOR <~~ (function = reverse_like, leftarg = text, rightarg = text);
Inspired by Jeff Janes' idea here:
Match string pattern to any array element
Then your query can have the pattern to the left of the operator:
SELECT *
FROM mac_ip_addresses
WHERE '192.168.2%.255' <~~ ANY (ipaddress);
Simple, but considerably slower than the EXISTS expression demonstrated by filiprem.
Then again, either query is excruciatingly slow for big tables, since neither can use an index. A normalized DB design with an n:1 table holding one IP each would allow that. It would also occupy several times the space on disk. Still, it's the much cleaner implementation ...
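A sketch of that normalized design (all names are mine):
CREATE TABLE ip_address (
    device_id int,  -- references mac_ip_addresses.id
    ip        text NOT NULL
);
A trigram index on ip_address.ip (the recipe below explains trigram indexes) would then serve LIKE queries on single addresses directly.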
While stuck with your current design, there is still a way: create a trigram GIN index on a text representation of the array, and add a redundant, "sargable" predicate to the query. Confused? Here's the recipe:
First, trigram indexes? Read this if you are not familiar:
PostgreSQL LIKE query performance variations
Neither the cast from text[] to text nor array_to_string() is immutable, but we need that for an expression index. Long story short, fake it with an immutable wrapper function:
CREATE OR REPLACE FUNCTION f_textarr2text(text[])
RETURNS text LANGUAGE sql IMMUTABLE AS $$SELECT array_to_string($1, ',')$$;
CREATE INDEX iparr_trigram_idx ON mac_ip_addresses
USING gin (f_textarr2text(ipaddress) gin_trgm_ops);
Related answer with the long story (and why it's safe):
Indexing an array for full text search
Then your query can be:
SELECT *
FROM mac_ip_addresses
WHERE '192.168.9%.255' <~~ ANY (ipaddress)
AND f_textarr2text(ipaddress) LIKE '192.168.9%.255'; -- logically redundant
The added predicate is logically redundant, but it can tap into the power of the trigram index, making the query much faster for big tables. The following variant is a bit faster, still:
SELECT *
FROM mac_ip_addresses
WHERE EXISTS (SELECT FROM unnest(ipaddress) ip WHERE ip LIKE '192.168.9%.255')
AND f_textarr2text(ipaddress) LIKE '192.168.9%.255';
But that's minor now.
db<>fiddle here
I addressed the question asked, as I took an interest. Might be of interest to the general public. Most probably not what you need, though.
What you need
I want to search in ipAddress with LIKE. searchItem contains the IP that need to be searched in the ipAddress field so I want to search in array with LIKE.
That should probably read:
"I want to search a given IP address (searchItem) in the array ipAddress. My first idea is to use LIKE ..."
Well, LIKE is for pattern matching. To find complete IP addresses in an array, it's the wrong tool. filiprem's second query with array operators is the way to go. Probably good enough.
Using the built-in data type cidr instead of text would be better, and the ip4 data type of the additional ip4r module would be better yet - all in combination with standard array operators as demonstrated.
Finally, converting IPv4 addresses to integer and using that with the additional inrarray module should be stellar - as far as performance is concerned.
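A hedged sketch of the cidr variant (the column name ip_cidr is mine; the element-wise cast from text[] works because text casts to cidr):
ALTER TABLE mac_ip_addresses ADD COLUMN ip_cidr cidr[];
UPDATE mac_ip_addresses SET ip_cidr = ipaddress::cidr[];

SELECT *
FROM mac_ip_addresses
WHERE ip_cidr @> ARRAY['192.168.1.60'::cidr];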

Related

LIKE query on elements of flat jsonb array

I have a Postgres table posts with a column of type jsonb which is basically a flat array of tags.
What I need to do is somehow run a LIKE query on the elements of that tags column, so that I can find posts which have tags beginning with some partial string.
Is such a thing possible in Postgres? I keep finding super complex examples and no one ever describes such a basic and simple scenario.
My current code works fine for checking if there are posts having specific tags:
select * from posts where tags @> '"TAG"'
and I'm looking for a way of running something along the lines of
select * from posts where tags @> '"%TAG%"'
SELECT *
FROM posts p
WHERE EXISTS (
SELECT FROM jsonb_array_elements_text(p.tags) tag
WHERE tag LIKE '%TAG%'
);
Related, with explanation:
Search a JSON array for an object containing a value matching a pattern
Or simpler with the @? operator since Postgres 12 implemented SQL/JSON:
SELECT *
-- optional to show the matching item:
-- , jsonb_path_query_first(tags, '$[*] ? (@ like_regex "^TAG" flag "i")')
FROM posts
WHERE tags @? '$[*] ? (@ like_regex "TAG")';
The operator @? is just a wrapper around the function jsonb_path_exists(). So this is equivalent:
...
WHERE jsonb_path_exists(tags, '$[*] ? (@ like_regex "TAG")');
Neither has index support. (It may be added for the @? operator later, but it's not there in pg 13, yet.) So those queries are slow for big tables. A normalized design, as Laurenz already suggested, would be superior - with a trigram index:
PostgreSQL LIKE query performance variations
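A sketch of that normalized design (table, column and index names are mine; the index needs the pg_trgm extension):
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE post_tag (
    post_id int,  -- references posts.id
    tag     text NOT NULL
);
CREATE INDEX post_tag_tag_trgm_idx ON post_tag USING gin (tag gin_trgm_ops);

SELECT DISTINCT post_id
FROM post_tag
WHERE tag LIKE '%TAG%';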
For just prefix matching (LIKE 'TAG%', no leading wildcard), you could make it work with a full text index:
CREATE INDEX posts_tags_fts_gin_idx ON posts USING GIN (to_tsvector('simple', tags));
And a matching query:
SELECT *
FROM posts p
WHERE to_tsvector('simple', tags) @@ 'TAG:*'::tsquery;
Or use the english dictionary instead of simple (or whatever fits your case) if you want stemming for natural English language.
to_tsvector(json(b)) requires Postgres 10 or later.
Related:
Get partial match from GIN indexed TSVECTOR column
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL

MS Access, use query name as field default value

My department uses a software tool that can use a custom component library sourced from Tables or Queries in an MS Access database.
Table: Components
ID: AutoNumber
Type: String
Mfg: String
P/N: String
...
Query: Resistors
SELECT Components.*
FROM Components
WHERE Components.Type = "Resistors"
Query: Capacitors
SELECT Components.*
FROM Components
WHERE Components.Type = "Capacitors"
These queries work fine for SELECT. But when users add a row to the query, how can I ensure the correct value is saved to the Type field?
Edit #2:
Nope, can't be done. Sorry.
Edit #1:
As was pointed out, I may have misunderstood the question. It's not a wonky question after all, but perhaps an easy one?
If you're asking how to add records to your table while making sure that, for example, "the record shows up in a Resistors query if it's a Resistor", then it's a regular append query that specifies Resistors as your Type.
For example:
INSERT INTO Components ( ID, Type, Mfg )
SELECT 123, 'Resistors', 'Company XYZ'
If you've already tried that and are having problems, it could be because you are using a Reserved Word as a field name which, although it may work sometimes, can cause problems in unexpected ways.
Type is a word that Access, SQL and VBA all use for a specific purpose. It's the same idea as if you used SELECT and FROM as field or table names. (SELECT SELECT FROM FROM).
Here is a list of reserved words that should generally be avoided. (I realize it's labelled Access 2007, but the list is very similar, and it's surprisingly difficult to find a recent 'official' list for Excel VBA.)
Original Answer:
That's kind of a wonky way to do things. The point of databases is to organize in such a way as to prevent duplication, not only of data but of queries and code as well.
I made up a programming rule for my own use: "If you're doing anything more than once, you're doing it wrong." (That's not true in all cases, but it's a general rule of thumb nonetheless.)
Are the only options "Resistors" and "Capacitors"? (...I hope you're not tracking the inventory of an electronics supply store...) If there are many options, that's even more reason to find an alternative method.
To answer your question, in the Query Design window, it is not possible to return the name of the open query.
Some alternative options:
As @Erik suggested, constrain to a control on a form. Perhaps have a drop-down or option buttons from which the user can select the relevant type. Then your query would look like:
SELECT * FROM Components WHERE Type = Forms![YourFormName]![NameOfYourControl]
In VBA, have the query refer to the value of a variable, for example:
Dim TypeToDel As String
TypeToDel = "Resistor"
strSQL = "SELECT * FROM Components WHERE Type = '" & TypeToDel & "'"
(Note that DoCmd.RunSQL only runs action queries; to use a SELECT statement from VBA, open it as a recordset with CurrentDb.OpenRecordset(strSQL) instead.)
Not recommended, but you could have the user manually enter the criteria. If your query is like this:
SELECT * FROM Components WHERE Type = [Enter the component type]
...then each time the query is run, it will prompt the user for the criteria.
Similarly, you could have the query prompt for an option, perhaps a single digit or a code, and have an IIf expression in the query criteria choose the appropriate value:
SELECT *
FROM Components
WHERE Type = IIf([Enter 1 for Resistors, 2 for Capacitors, 3 for sharks with frickin' laser beams attached to their heads]=1, 'Resistors',
    IIf([Enter 1 for Resistors, 2 for Capacitors, 3 for sharks with frickin' laser beams attached to their heads]=2, 'Capacitors', 'LaserSharks'));
Note that if you're going to have more than 2 options, you'll need to include the parameter prompt more than once, and the prompts must be spelled identically.
Lastly, if you're still going to take the route of a separate query for each component type, as long as you're making separate queries anyway, why not just put a static value in each one (just like your example):
SELECT * FROM Components WHERE Type = 'Resistor'
There's another wonky answer here but that's just creating even more duplicate information (and more future mistakes).
Side note: Type is a reserved word in Access & VBA; you might be best to choose another. (I usually prefix with a related letter like cType.)
More Information:
Use parameters in queries, forms, and reports
Use parameters to ask for input when running a query
Microsoft Access Tips & Tricks: Parameter Queries
 • Frickin' Lasers

App engine - easy text search

I was hoping to implement an easy but effective text search for App Engine that I could use until official text search capabilities for App Engine are released. I see there are libraries out there, but it's always a hassle to install something new. I'm wondering if this is a valid strategy:
1) Break each property that needs to be text-searchable into a set (list) of text fragments
2) Save record with these lists added
3) When searching, just use equality filters on the list properties
For example, if I had a record:
{
firstName="Jon";
lastName="Doe";
}
I could save a property like this:
{
firstName="Jon";
lastName="Doe";
// not case sensitive:
firstNameSearchable=["j","o","n","jo","on","jon"];
lastNameSearchable=["d","o","e","do","oe","doe"];
}
Then to search, I could do this and expect it to return the above record:
//pseudo-code:
SELECT person
WHERE firstNameSearchable=="jo" AND
lastNameSearchable=="oe"
Is this how text searches are implemented? How do you keep the index from getting out of control, especially if you have a paragraph or something? Is there some other compression strategy that is usually used? I suppose if I just want something simple, this might work, but it's nice to know the problems that I might run into.
Update:
Ok, so it turns out this concept is probably legitimate. This blog post also refers to it: http://googleappengine.blogspot.com/2010/04/making-your-app-searchable-using-self.html
Note: the source code in the blog post above does not work with the current version of Lucene. I installed the older version (2.9.3) as a quick fix, since Google is supposed to come out with their own text search for App Engine soon enough anyway.
The solution suggested in the response below is a nice quick fix but, due to BigTable's limitations, it only works if you are querying on one field, because you can only use inequality operators on one property in a query:
db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2", "abc", u"abc" + u"\ufffd")
If you want to query on more than one property, you can save indexes for each property. In my case, I'm using this for some auto-suggest functionality on small text fields, not actually searching for word and phrase matches in a document (you can use the blog post's implementation above for this). It turns out this is pretty simple and I don't really need a library for it. Also, I anticipate that if someone is searching for "Larry" they'll start by typing "La..." as opposed to starting in the middle of the word: "arry". So if the property is for a person's name or something similar, the index only has the substrings starting with the first letter, so the index for "Larry" would just be {"l", "la", "lar", "larr", "larry"}
I did something different for data like phone numbers, where you may want to search starting from the beginning or from middle digits. In this case, I just stored the entire set of substrings of length 3 or more, so the phone number "123-456-7890" would be: {"123","234","345", ..... "123456789","234567890","1234567890"}, a total of (10*((10+1)/2))-(10+9) = 36 indexes... actually what I did was a little more complex in order to remove some unlikely-to-be-used substrings, but you get the idea.
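A minimal Python sketch of building those index lists (function names are mine; dashes are dropped from phone numbers before indexing):
def prefix_index(value):
    # Prefixes for name-like fields: "Larry" -> ["l", "la", "lar", "larr", "larry"]
    v = value.lower()
    return [v[:i] for i in range(1, len(v) + 1)]

def substring_index(value, min_len=3):
    # All substrings of length >= min_len, for phone-number-like fields.
    v = value.replace("-", "")
    return sorted({v[i:j] for i in range(len(v)) for j in range(i + min_len, len(v) + 1)})

len(substring_index("123-456-7890"))  # -> 36 distinct substrings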
Then your query would be:
(Pseudo-code)
SELECT * FROM Person WHERE
firstNameSearchIndex == "lar" AND
phonenumberSearchIndex == "1234"
The way App Engine works, if the query value matches any of the substrings stored in the list property, that counts as a match.
In practice, this won't scale. For a string of n characters, you need n(n+1)/2 index entries to cover every substring; a 500-character string alone needs 125,250 of them, far past the 5,000-item limit on list properties. You will die of old age before your entity finishes writing to the datastore.
Implementations like search.SearchableModel create one index entry per word, which is a bit more realistic. You can't search for arbitrary substrings, but there is a trick that lets you match prefixes:
From the docs:
db.GqlQuery("SELECT * FROM MyModel
WHERE prop >= :1 AND prop < :2",
"abc", u"abc" + u"\ufffd")
This matches every MyModel entity with a string property prop that begins with the characters abc. The unicode string u"\ufffd" represents the largest possible Unicode character. When the property values are sorted in an index, the values that fall in this range are all of the values that begin with the given prefix.

GQL query with "like" operator [duplicate]

Simple one really. In SQL, if I want to search a text field for a couple of characters, I can do:
SELECT blah FROM blah WHERE blah LIKE '%text%'
The documentation for App Engine makes no mention of how to achieve this, but surely it's a common enough problem?
BigTable, which is the database back end for App Engine, will scale to millions of records. Because of this, App Engine will not allow you to do any query that would result in a table scan, as performance would be dreadful for a well-populated table.
In other words, every query must use an index. This is why you can only do =, > and < queries. (In fact you can also do !=, but the API does this using a combination of > and < queries.) This is also why the development environment monitors all the queries you do and automatically adds any missing indexes to your index.yaml file.
There is no way to index for a LIKE query so it's simply not available.
Have a watch of this Google IO session for a much better and more detailed explanation of this.
I'm facing the same problem, but I found something on the Google App Engine pages:
Tip: Query filters do not have an explicit way to match just part of a string value, but you can fake a prefix match using inequality filters:
db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2",
"abc",
u"abc" + u"\ufffd")
This matches every MyModel entity with a string property prop that begins with the characters abc. The unicode string u"\ufffd" represents the largest possible Unicode character. When the property values are sorted in an index, the values that fall in this range are all of the values that begin with the given prefix.
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html
Maybe this could do the trick ;)
Although App Engine does not support LIKE queries, have a look at the properties ListProperty and StringListProperty. When an equality test is done on these properties, the test is actually applied to all list members, e.g., list_property = value tests whether the value appears anywhere in the list.
Sometimes this feature might be used as a workaround to the lack of LIKE queries. For instance, it makes it possible to do simple text search, as described on this post.
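A minimal sketch of that workaround with the old db API (model and property names are mine):
from google.appengine.ext import db

class Post(db.Model):
    body = db.TextProperty()
    words = db.StringListProperty()  # e.g. the lowercased words of body

# words = 'fern' matches if 'fern' appears anywhere in the list:
results = Post.all().filter('words =', 'fern').fetch(20)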
You need to use the search service to perform full-text search queries similar to SQL LIKE.
Gaelyk provides a domain-specific language to perform more user-friendly search queries. For example, the following snippet will find the first ten books, sorted from the latest ones, with a title containing fern and the genre exactly matching thriller:
def documents = search.search {
select all from books
sort desc by published, SearchApiLimits.MINIMUM_DATE_VALUE
where title =~ 'fern'
and genre = 'thriller'
limit 10
}
Like is written as Groovy's match operator =~.
It supports functions such as distance(geopoint(lat, lon), location) as well.
App Engine launched a general-purpose full-text search service in version 1.7.0 that supports the datastore.
Details in the announcement.
More information on how to use this: https://cloud.google.com/appengine/training/fts_intro/lesson2
Have a look at Objectify here; it is like a Datastore access API. There is a FAQ with this question specifically; here is the answer:
How do I do a like query (LIKE "foo%")
You can do something like a startsWith, or an endsWith if you reverse the order when storing and searching. You do a range query with the starting value you want and a value just above it.
String start = "foo";
... = ofy.query(MyEntity.class).filter("field >=", start).filter("field <", start + "\uFFFD");
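The endsWith variant from the FAQ could look like this (a sketch; fieldReversed is an assumed extra property storing the value reversed):
String suffix = "bar";
String reversed = new StringBuilder(suffix).reverse().toString();  // "rab"
... = ofy.query(MyEntity.class).filter("fieldReversed >=", reversed).filter("fieldReversed <", reversed + "\uFFFD");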
Just follow here:
http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/search/__init__.py#354
It works!
class Article(search.SearchableModel):
    text = db.TextProperty()
...
article = Article(text=...)
article.save()
To search the full text index, use the SearchableModel.all() method to get an
instance of SearchableModel.Query, which subclasses db.Query. Use its search()
method to provide a search query, in addition to any other filters or sort
orders, e.g.:
query = Article.all().search('a search query').filter(...).order(...)
I tested this with the GAE Datastore low-level Java API. It works perfectly for me.
Query q = new Query(Directorio.class.getSimpleName());
Filter filterNombreGreater = new FilterPredicate("nombre", FilterOperator.GREATER_THAN_OR_EQUAL, query);
Filter filterNombreLess = new FilterPredicate("nombre", FilterOperator.LESS_THAN, query + "\uFFFD");
Filter filterNombre = CompositeFilterOperator.and(filterNombreGreater, filterNombreLess);
q.setFilter(filterNombre);
In general, even though this is an old post, a way to produce a 'LIKE' or 'ILIKE' is to gather all results from a '>=' query, then loop over the results in Python (or Java), keeping only the elements containing what you're looking for.
Let's say you want to filter users given a q='luigi'
users = []
qry = self.user_model.query(ndb.OR(self.user_model.name >= q.lower(),
                                   self.user_model.email >= q.lower(),
                                   self.user_model.username >= q.lower()))
for _qry in qry:
    if q.lower() in _qry.name.lower() or q.lower() in _qry.email.lower() or q.lower() in _qry.username.lower():
        users.append(_qry)
It is not possible to do a LIKE search on the App Engine datastore; however, creating an ArrayList will do the trick if you need to search for a word within a string:
@Index
public ArrayList<String> searchName;
and then search the index using Objectify:
List<Profiles> list1 = ofy().load().type(Profiles.class).filter("searchName =", search).list();
This will give you a list of all the items that contain the word you searched for.
If the LIKE '%text%' always compares to a word or a few (think permutations) and your data changes slowly (slowly means that it's not prohibitively expensive - both price-wise and performance-wise - to create and update indexes), then a Relation Index Entity (RIE) may be the answer.
Yes, you will have to build an additional datastore entity and populate it appropriately. Yes, there are some constraints that you will have to play around with (one is the 5,000 limit on the length of a list property in the GAE datastore). But the resulting searches are lightning fast.
For details see my RIE with Java and Objectify and RIE with Python posts.
"Like" is often uses as a poor-man's substitute for text search. For text search, it is possible to use Whoosh-AppEngine.

Searching for and matching elements across arrays

I have two tables.
In one table there are two columns: one has the ID and the other the abstract of a document, about 300-500 words long. There are about 500 rows.
The other table has only one column and >18000 rows. Each cell of that column contains a distinct acronym such as NGF, EPO, TPO etc.
I am interested in a script that will scan each abstract from table 1 and identify one or more of the acronyms present in it which are also present in table 2.
Finally, the program will create a separate table where the first column contains the first column of table 1 (i.e. the ID) and the second the acronyms found in the document associated with that ID.
Can someone with expertise in Python, Perl or any other scripting language help?
It seems to me that you are trying to join the two tables where the acronym appears in the abstract, i.e. (pseudo-SQL):
SELECT acronym.id, document.id
FROM acronym, document
WHERE acronym.value IN explode(documents.abstract)
Given the desired semantics you can use the most straight forward approach:
acronyms = ['ABC', ...]
documents = [(0, "Document zero discusses the value of ABC in the context of..."), ...]

joins = []
for id, abstract in documents:
    for word in abstract.split():
        try:
            index = acronyms.index(word)
            joins.append((id, index))
        except ValueError:
            pass  # word not an acronym
This is a straightforward implementation; however, it has roughly cubic running time, as acronyms.index performs a linear search (of our largest array, no less). We can improve the algorithm by first building a hash index of the acronyms:
acronyms = ['ABC', ...]
documents = [(0, "Document zero discusses the value of ABC in the context of..."), ...]

index = dict((acronym, idx) for idx, acronym in enumerate(acronyms))

joins = []
for id, abstract in documents:
    for word in abstract.split():
        try:
            joins.append((id, index[word]))
        except KeyError:
            pass  # word not an acronym
Of course, you might want to consider using an actual database. That way you won't have to implement your joins by hand.
Thanks a lot for the quick response.
I assume the pseudo-SQL solution is for MySQL etc.; however, it did not work in Microsoft Access.
The second and the third are for Python, I assume. Can I feed acronym and document as input files?
babru
It didn't work in Access because tables are accessed differently (e.g. acronym.[id])
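In standard SQL, that join can be written with LIKE instead of the nonstandard explode(); a hedged sketch, assuming tables documents(id, abstract) and acronyms(acronym):
SELECT d.id, a.acronym
FROM documents AS d
JOIN acronyms AS a ON d.abstract LIKE '%' || a.acronym || '%';
(In Access, use * wildcards and & concatenation: ON d.abstract Like '*' & a.acronym & '*'. Note that this form also matches acronyms embedded inside longer words.)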
