How can I export all results from Blazegraph into a file?

I would like to export the results of my SPARQL query from Blazegraph into a file. However, it exports only the first page of the results. When I try to display all results, my browser crashes.
How can I fix this?
I'm running Blazegraph 2.1.2 on a local cluster.

To export results, you can use curl to query your SPARQL endpoint from the command line, like this:
curl -X POST http://localhost:9999/bigdata/namespace/YOUR_NAMESPACE/sparql --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 1' --data-urlencode 'format=json' > outputfile
Replace the endpoint address and the query with your own; this is just an example to give you the idea.
You can also change the output format (CSV, XML, JSON, etc.) and include headers if you want.
Here you can read more about it.

If you want to download the entire graph, use a CONSTRUCT query:
curl -X POST \
--url 'https://{host}/bigdata/namespace/{namespace}/sparql' \
--data-urlencode 'query=CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }' \
--header 'Accept: application/x-turtle' > outputfile.ttl
In this case, the graph is exported in Turtle format.
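The same export can be scripted so that the response is written to disk in chunks and never has to fit in a browser or in memory. The following is a minimal sketch using only the Python standard library. The endpoint value, the function names, and the `text/csv` Accept header are assumptions for illustration; the `application/x-turtle` header is the one used in the curl command above.

```python
import shutil
import urllib.parse
import urllib.request

# Assumed endpoint; replace host, port, and namespace with your own.
ENDPOINT = "http://localhost:9999/bigdata/namespace/YOUR_NAMESPACE/sparql"


def build_request(endpoint, query, accept):
    """Build a POST request that carries the SPARQL query as form data."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": accept,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def export_to_file(endpoint, query, accept, path):
    """Send the query and copy the response to a file chunk by chunk."""
    request = build_request(endpoint, query, accept)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Calling `export_to_file(ENDPOINT, "SELECT * WHERE { ?s ?p ?o }", "text/csv", "results.csv")` would save SELECT results, and passing a CONSTRUCT query with `"application/x-turtle"` would save the graph as Turtle, assuming the server accepts those media types.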

Related

Document is not returned when searched using query parameter in solr

I updated a document in Solr using the request below, and it was successful. Apart from the fields shown in the request, the document has other fields such as organization_name, place, etc.
curl -X POST -H 'Content-Type: application/json' 'http://localhost:8390/solr/collection/update?commit=true' -d '[{"id" : "12345","code" : {"set" : "500"}}]'
{
"responseHeader":{
"rf":1,
"status":0,
"QTime":11}}
After the update, when I query Solr with the name in the query parameter (q), it does not return any document. However, if I pass the name in the fq parameter, the document comes up fine.
This query does not work:
http://localhost:8390/solr/collection/select?q=test&qf=content^0.1%20name_display_name^1.0&defType=edismax
But this query works (with the fq parameter):
http://localhost:8390/solr/collection/select?q=*%3A*&rows=1000&fq=ngram_info_organization_name:test
The field type of organization_name is string, and it is both indexed and stored.
This issue occurs only for the document that I updated. If I query for other documents that were not updated, I see the results.
Please help me figure out why the document is not listed when I use the name in the query parameter.

Apache solr fuzzy search on list values

Environment - solr-8.9.0
To implement fuzzy search on the column "name" (fuzzy search for 'alaistiar~') of a CSV file in Apache Solr, I am issuing the following query:
http://localhost:8983/solr/bigboxstore/select?indent=on&q=name:'alaistiar~'&wt=json
To implement fuzzy search on the column "name" (fuzzy search for 'shanka~') of the same CSV file, I use:
http://localhost:8983/solr/bigboxstore/select?indent=on&q=name:'shanka~'&wt=json
Can I combine both of the above queries into a single one and still find the documents?
My first HTTP request does a fuzzy search for the value alaistiar~ on the name column and returns some score values, and the second HTTP request does the same for shanka~. When I combine both with the 'OR' operator, will it behave the same as the individual requests? Actually, my purpose is that I don't want to invoke multiple HTTP requests for multiple names. I also want the fuzzy search name in the output, indicating that this document is for the name alaistiar~ and that document is for the name shanka~.
I have loaded a CSV file (size: 5 GB) with 4 columns and 100 million records. The .csv file has the following column names:
'name', 'father_name', 'date_of_passing','admission_number'
I have created an index on the column 'name'. To do this, I executed the following curl requests against the managed-schema (solr-8.9.0, jdk-11.0.12):
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field":{"name":"name","type":"text_general","stored":true,"indexed":true }}' http://localhost:8983/solr/bigboxstore/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field":{"name":"father_name","type":"text_general","stored":true,"indexed":false }}' http://localhost:8983/solr/bigboxstore/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field":{"name":"date_of_passing","type":"pdate","stored":true,"indexed":false }}' http://localhost:8983/solr/bigboxstore/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field":{"name":"admission_number","type":"text_general","stored":true,"indexed":false }}' http://localhost:8983/solr/bigboxstore/schema
Is this the right way to create an index on one column (only on name), as described above?
Now I have a list of 1 million names. For each name, I have to do a fuzzy search (column: name) on the already loaded data. In the output, for each name, I have to return a list of Java objects including all 4 columns of the .csv file.
Note: In the output I also have to include the name that was supplied as input (in the where clause).
For each name, I am doing a fuzzy search as follows:
http://localhost:8983/solr/bigboxstore/select?indent=on&q=name:'alaistiar~'&wt=json
To do this I have to execute 1 million HTTP requests, which I don't want. Instead of executing 1 million HTTP requests, can I do it in a single HTTP request?
I understand that the 'OR' operator will not solve my problem, because I will not be able to group the output documents by the name that was passed as input.

Yes, you could unify the queries by using "OR":
name:alaistiar~ OR name:shanka~
http://localhost:8983/solr/bigboxstore/select?indent=on&q=name:alaistiar~ OR name:shanka~&wt=json
You could omit "OR" if your default operator is "OR". The query would look like:
name:alaistiar~ name:shanka~
http://localhost:8983/solr/bigboxstore/select?indent=on&q=name:alaistiar~ name:shanka~&wt=json
Of course, the spaces should be escaped in the URL.
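One way to get the escaping right is to let a library build the query string. The sketch below joins one fuzzy clause per name with OR and percent-encodes the result. The function name `build_fuzzy_url` is made up for illustration, and the base URL is the one from the question.

```python
import urllib.parse

# Base URL taken from the question; adjust host and core name as needed.
SELECT_URL = "http://localhost:8983/solr/bigboxstore/select"


def build_fuzzy_url(names, base_url=SELECT_URL):
    """Join one fuzzy clause per name with OR and encode the query string."""
    # Names are assumed to contain no Solr query syntax characters.
    query = " OR ".join(f"name:{name}~" for name in names)
    params = {"indent": "on", "q": query, "wt": "json"}
    return base_url + "?" + urllib.parse.urlencode(params)


# Prints a URL in which the spaces and colons of the query are encoded.
print(build_fuzzy_url(["alaistiar", "shanka"]))
```

For a very long list of names the URL may become too large for a GET request, so sending the query in a POST body is likely the safer choice there.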
Hello again. After you edited your question, it is clearer what you are looking for:
1. only 1 query for 1 million names
2. a way to see in the result which document corresponds to which name
There is a solution, but you have to do some post-processing. You can use a POST request with JSON params (for 1), and you can use hit highlighting (for 2), like I did here:
curl 'http://localhost:8983/solr/bigboxstore/query?hl.fl=name&hl.simple.post=</b>&hl.simple.pre=<b>&hl=on' -H "Content-Type: application/x-www-form-urlencoded" -X POST -d 'json={"query":"name:alaistiar~ name:shanka~"}'
The response contains two parts: the first with the results and the second with the ids and the highlights. You will have to pair them up on id after receiving the response.
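The pairing step could look roughly like the sketch below. It assumes the usual shape of a Solr JSON response, with documents under `response.docs` and a `highlighting` object keyed by document id, and it assumes each document has an `id` field as its unique key. The sample data is hypothetical and only there to show the structure.

```python
def pair_docs_with_highlights(solr_response, field="name"):
    """Attach each document's highlight snippets, matching on the document id."""
    highlights = solr_response.get("highlighting", {})
    paired = []
    for doc in solr_response.get("response", {}).get("docs", []):
        # Highlight keys are strings, so the id is converted before the lookup.
        snippets = highlights.get(str(doc["id"]), {}).get(field, [])
        paired.append({"doc": doc, "highlights": snippets})
    return paired


# Hypothetical response trimmed to the two parts that matter; ids and values are made up.
sample = {
    "response": {"docs": [
        {"id": "1", "name": "alastair"},
        {"id": "2", "name": "shankar"},
    ]},
    "highlighting": {
        "1": {"name": ["<b>alastair</b>"]},
        "2": {"name": ["<b>shankar</b>"]},
    },
}

for item in pair_docs_with_highlights(sample):
    print(item["doc"]["id"], item["highlights"])
```

The highlighted term is the indexed value that matched, not the input name itself, so mapping it back to the input name that triggered the match still needs a comparison on your side.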

CouchDB - Mango Query to select records based on complex composite key

I have records with keys looking like this:
"001_test_66"
"001_testab_54"
"002_testbc_88"
"0020_tesgdtbc_38"
How can I query a CouchDB database using Mango queries based on the first part of the key (001 or 002)? The fourth one should not match if I search for '002'.

You can use the $regex operator described in the Condition Operators chapter of the CouchDB API Reference. In the example below, I assumed _id to be the key you want to search by. The pattern includes the underscore after the prefix, so that a search for 002 does not also match 0020_tesgdtbc_38.
"selector": {
  "_id": {
    "$regex": "^001_"
  }
}
Here's an example using curl (replace <db> with the name of your database):
curl -H 'Content-Type: application/json' -X POST http://localhost:5984/<db>/_find -d '{"selector":{"_id":{"$regex": "^001_"}}}'
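A prefix pattern can be checked locally against the sample keys before it is sent to CouchDB. The sketch below uses Python's `re` module as a stand-in, on the assumption that such simple anchored patterns behave the same way in CouchDB's regex engine. The helper names `matching_keys` and `build_selector` are made up for illustration.

```python
import json
import re

# Keys from the question.
KEYS = ["001_test_66", "001_testab_54", "002_testbc_88", "0020_tesgdtbc_38"]


def matching_keys(pattern, keys=KEYS):
    """Return the keys in which the pattern is found."""
    return [key for key in keys if re.search(pattern, key)]


def build_selector(prefix):
    """Build a Mango selector that matches ids starting with '<prefix>_'."""
    return {"selector": {"_id": {"$regex": f"^{re.escape(prefix)}_"}}}


# A bare prefix also matches longer prefixes; adding the underscore does not.
print(matching_keys("^002.*"))
print(matching_keys("^002_"))
print(json.dumps(build_selector("002")))
```

The JSON printed on the last line is the body that would be posted to the `_find` endpoint.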

Solr add field to schema using curl

A:\DOS> curl -X POST -H 'Content-type:application/json' \
--data-binary '{"add-field":{"name":"timestamp","type":"date","indexed":true,"stored":true,\
"default":NOW,"multiValued":false}}' http://localhost:8983/solr/testt/schema
{
"reponseHeader":{
"status":0,
"QTime":0},
"errors":"no stream"}
I am trying to add a 'timestamp' field to solr and this is the error which I am getting. Can anyone help me figure out where I am wrong in this?

There may be two problems:
&commit=true
The commit parameter was not added at the end of the URL.
schema.xml
The schema does not contain the field you want to set.

There is a problem with your curl command, because it works if you pass the JSON from a file instead:
curl -d '@timestamp.json' -X POST -H 'Content-type:application/json' http://localhost:8983/solr/testt/schema
Create the timestamp.json file with this content:
{
"add-field":{"name":"timestamp","type":"date","indexed":true,"stored":true,"default":"NOW","multiValued":false}
}
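Generating the payload with a JSON library avoids quoting mistakes, such as a bare NOW, which is not valid JSON. The sketch below only builds the request body. The function name `build_add_field` and its parameters are made up for illustration, and the field values are the ones from the question.

```python
import json


def build_add_field(name, field_type, default=None,
                    indexed=True, stored=True, multi_valued=False):
    """Return the JSON text of an add-field command for the Schema API."""
    field = {
        "name": name,
        "type": field_type,
        "indexed": indexed,
        "stored": stored,
        "multiValued": multi_valued,
    }
    if default is not None:
        # The default is serialized as a JSON string, e.g. "NOW".
        field["default"] = default
    return json.dumps({"add-field": field})


# Write the payload to the file that the curl command reads.
with open("timestamp.json", "w", encoding="utf-8") as handle:
    handle.write(build_add_field("timestamp", "date", default="NOW"))
```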

Delete all documents from Solr that have a certain empty field

Querying for those documents works with: "fq=-myfield:[* TO *]".
But how can I delete all those? It seems that the delete syntax update?stream.body=<delete><query>... accepts only a query, no filters...

Pass only -myfield[* TO *] inside the query tag, and do not pass the fq parameter; I think it will work then. I once had to delete all documents whose id field contained the word "data", so I passed id:*data* between the query tags, and it worked. Let me know if that helps.

The correct answer should be -myfield:* or even -myfield:[* TO *]; the : is mandatory.
This is an example with curl:
curl http://localhost:8983/solr/collection/update \
-H "Content-Type: text/xml" \
--data-binary '<delete><query>-myfield:*</query></delete>'
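If the delete is issued from a script, the query should be XML-escaped before it is placed inside the query tag. The following sketch builds the same body and request as the curl example. The function names are made up for illustration, and the update URL is the one from the curl command.

```python
import urllib.request
from xml.sax.saxutils import escape

# URL from the curl example; replace "collection" with your own.
UPDATE_URL = "http://localhost:8983/solr/collection/update"


def build_delete_body(query):
    """Wrap a Solr query in a delete-by-query XML message."""
    # escape() replaces &, < and > so the query cannot break the XML.
    return f"<delete><query>{escape(query)}</query></delete>"


def build_delete_request(query, update_url=UPDATE_URL):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        update_url,
        data=build_delete_body(query).encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


print(build_delete_body("-myfield:*"))
```

Passing the result of `build_delete_request("-myfield:*")` to `urllib.request.urlopen` would send the request.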
