Solr indexing support for NetCDF files? - solr

I am brand new to Solr and NetCDF, and am working on a project that is very much outside my area of expertise, so I don't know where to look for the best information. I currently have an installation set up, and for now I am browsing the directories and configuration files to get familiar with them. Although I have found a few resources and tutorials that gave me a general understanding of how to work with Solr, I do not know how to apply this information to NetCDF specifically.
Are there any guides, books, or resources that provide information specific to my case? Does Solr/Lucene even support NetCDF indexing by itself? I would appreciate any advice/suggestions/input you might have.
Thanks in advance!

I do not know about Solr, but for NetCDF you can use standard tools to extract the data for indexing. Libraries exist for C, Java, and Python (you're probably not interested in Fortran), so just dump the necessary data to a format that Solr can read and that's it.
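As a rough illustration of the "dump it to something Solr can read" idea, here is a minimal Python sketch. It assumes the third-party netCDF4 package is installed and that the Solr schema has dynamic string fields matching `*_s` and `*_ss`. The function names and field names are made up for this example, and attribute names are used as-is without sanitizing them for Solr.

```python
import json


def build_solr_doc(doc_id, global_attrs, variables):
    """Flatten NetCDF metadata into one Solr document (a plain dict).

    The *_s / *_ss suffixes assume dynamic string fields in the schema.
    """
    doc = {"id": doc_id, "variable_ss": sorted(variables)}
    for name, value in global_attrs.items():
        # Attribute values may be NumPy scalars or arrays; str() keeps them JSON-safe.
        doc["attr_" + name + "_s"] = str(value)
    for var_name, dims in variables.items():
        doc["dims_" + var_name + "_ss"] = list(dims)
    return doc


def read_netcdf_metadata(path):
    """Read global attributes and variable dimensions with netCDF4-python."""
    from netCDF4 import Dataset  # third-party; imported lazily (assumed installed)

    with Dataset(path) as ds:
        attrs = {name: ds.getncattr(name) for name in ds.ncattrs()}
        variables = {name: var.dimensions for name, var in ds.variables.items()}
    return attrs, variables


def to_update_payload(docs):
    """Serialize documents as a JSON array for Solr's JSON update handler."""
    return json.dumps(docs, indent=2)
```

The resulting JSON could then be sent with an HTTP POST (Content-Type: application/json) to the core's `/update` handler. Whether you index only metadata, as here, or also the data values depends on what you need to search.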

Related

Is there any Python-based (or with an API) library or tool (or repo) to index YAML-based config files with support for queries?

We have an experimentation tool that uses YAML config files to run experiments and deploy models. We made this choice some time ago to integrate better with Kubernetes orchestration.
Right now, we have hundreds of historical experiments, and we are stuck trying to index them for querying. I have seen several questions about converting YAML files to JSON for indexing, but we would like to keep them as YAML. I previously came across YAMLDB via another question. However, it has no support for querying and isn't tied to Python, which we'd like for interoperability.
Would anyone have pointers to any repos, packages, or libraries that do this (or perhaps Mongo extensions, if they exist)? In-progress/alpha code is also okay.
Thank you.

Replacing dtSearch with Lucene - Syntax

We are desperate to switch over to Lucene (via Solr), but one big issue we have is the syntax support.
dtSearch supports xfirstword, w/N, pre/N, and probably some others.
I think w/N can be ported to Lucene, but the other ones I have no idea how to port.
I did a search and found an article that claims they have made the switch--still using dtSearch syntax, but I have yet to get the source. I left a comment about getting the source, but no response yet.
What do you guys recommend?
We basically want Solr with dtSearch syntax.
Do you know of any good articles on how to add the features (to indexing, etc.) needed to support this syntax?
Since I wasn't able to find a good solution to this, I wrote a dtSearch parser in Antlr4.
Many of you have asked for it, so I've posted it to GitHub.
Here's the link:
https://github.com/blmille1/dtsearchparser
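For the w/N case mentioned in the question, a rough translation to Lucene's phrase-slop syntax might look like the sketch below. This is not taken from the linked parser. The function name is hypothetical, it only handles the simple `term w/N term` form, and the slop semantics are an approximation of dtSearch's proximity operator rather than an exact match. The ordered pre/N operator is deliberately left unhandled, since it would need an ordered span query (for example Lucene's SpanNearQuery with in-order matching) instead of a plain sloppy phrase.

```python
import re

# Matches only the simplest form: <word> w/N <word> or <word> pre/N <word>.
_PROXIMITY = re.compile(r"^\s*(\w+)\s+(w|pre)/(\d+)\s+(\w+)\s*$", re.IGNORECASE)


def dtsearch_to_lucene(expr):
    """Translate 'a w/N b' into an approximate Lucene sloppy phrase query."""
    match = _PROXIMITY.match(expr)
    if match is None:
        raise ValueError("unsupported expression: %r" % expr)
    left, op, distance, right = match.groups()
    if op.lower() == "pre":
        # Ordered proximity is not expressible as a plain sloppy phrase.
        raise NotImplementedError("pre/N needs an ordered span query")
    return '"%s %s"~%s' % (left, right, distance)
```

For example, `dtsearch_to_lucene("apple w/5 pear")` returns `"apple pear"~5`. A full port would need a real grammar, which is what the ANTLR4 parser above is for.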

Files included from the default SolrConfig

I am trying to optimize Solr.
The default solrconfig.xml that comes with solr > collection1 includes a lot of libs I don't really need. Could someone help me identify their purpose? (I only import via DIH.)
Please tell me what's in these:
contrib/extraction/lib
solr-cell-
contrib/clustering/lib
solr-clustering-
contrib/langid/lib/
solr-langid
contrib/extraction/lib
solr-cell-*
These are the Solr Cell libraries, which integrate with Tika and help you index rich documents, e.g. Microsoft Word, Excel, etc.
contrib/clustering/lib
solr-clustering-
Solr clustering provides the clustering support integrated with Carrot2.
Clustering helps you group documents and supports topic extraction, entity extraction, and much more.
contrib/langid/lib/
solr-langid
Solr LangId provides language detection. It adds the ability to detect the language of a document before indexing and then make appropriate decisions about analysis, etc.
Just exclude the jars if you are not using any of the above features and be sure you remove the mappings from the Solr configuration files as well.
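To find the entries to remove, something like the following sketch could help. It assumes the jars are loaded through `<lib>` elements in solrconfig.xml, as in the stock example config. The keyword list, function name, and sample paths are only illustrative, so adjust them to your install.

```python
import xml.etree.ElementTree as ET

# Substrings identifying the contribs discussed above (illustrative list).
UNUSED = ("extraction", "solr-cell", "clustering", "langid")

# A cut-down stand-in for solrconfig.xml; the paths are made up for the example.
SAMPLE = r"""<config>
  <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />
  <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
</config>"""


def unused_lib_entries(solrconfig_xml, keywords=UNUSED):
    """Return the attributes of every <lib> element that mentions a keyword."""
    root = ET.fromstring(solrconfig_xml)
    hits = []
    for lib in root.iter("lib"):
        # Look at every attribute that can name a jar or a directory of jars.
        target = " ".join(lib.get(attr, "") for attr in ("dir", "regex", "path"))
        if any(keyword in target for keyword in keywords):
            hits.append(dict(lib.attrib))
    return hits
```

Running `unused_lib_entries(SAMPLE)` reports the two extraction/solr-cell entries and leaves the DIH one alone. Remember that any request handlers or update processors that reference these features also have to go, as noted above.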

Advice on learning about web applications

I know how to write programs in Java and C++, and would like to learn how servers, databases and Internet based applications work so I could start developing them.
Where should I start? What should I learn first? What books would you recommend for me?
Thank you, in advance :)
I would start by trying Tomcat, which lets you create fairly basic web applications, and by playing around with either servlets or JSPs. There is a lot of documentation and there are many examples.
Or you could start by downloading and playing around with a database. PostgreSQL is really good. It is free, and it has a tool called pgAdmin, which is a really good IDE.
Once you have these set up and working, I would then start looking at some of the frameworks that exist to make using these tools a lot easier. For example, you could take a look at Guice or Spring for dependency injection, or a range of other tools.
You will probably also want to look into Velocity, FreeMarker, Struts, or something similar. These will make your life a lot easier.
For the database you could look at Hibernate or MyBatis. Both are really good and function slightly differently. Hibernate is very commonly used and caches objects very efficiently.
I don't know what you mean by "cells"; anyway, you could start with open source technologies and their online docs, like Apache, MySQL, and PHP.

Jar files for PDF generation through Java

Where can I get the following jar files?
adobe-livecycle-client.jar
adobe-usermanager-client.jar
adobe-utilities.jar
How do I download these jar files?
You buy them. And from the looks of things, they're not going to be cheap.
If you are looking for a free, open-source solution for generating PDFs, the most widely used one is iText.
Well, for Java-based PDF generation, I guess we still don't have a clean way. All the solutions are primitive and feel like workarounds. There is no easy solution for:
1. Designing a PDF template
2. Then, at runtime, using Java to populate the template with data, from XML or other data sources
Such a simple requirement, and none of them has a good "open-source and free" solution yet!
Eclipse BIRT comes close, but it does not handle barcode elements out of the box.
