We have an experimentation tool that uses YAML config files to run experiments and deploy models. We made this choice some time ago to integrate better with a Kubernetes orchestration.
Right now, we have hundreds of historical experiments, and we are stuck trying to index them for querying. I have seen several questions about converting YAML files to JSON for indexing, but we would like to keep them as YAML. I found YAMLDB through an earlier question. However, it has no support for querying and isn't tied to Python, which we'd like for interoperability.
Would anyone have pointers to any repos, packages, or libraries that do this (or perhaps Mongo extensions, if they exist)? In-progress/alpha code is also okay.
Thank you.
I am new to graph databases, Gremlin and TinkerPop. We are using them in an application we are building, and the setup was done by another team.
Now when I try to run the Gremlin queries provided in the TinkerPop documentation, many of them do not work and I get errors saying 'no signature of method:'.
Can you please guide me on what to check, and how, whether versions or anything else, to make them work?
We are using JanusGraph, with Cassandra as the storage backend and Elasticsearch for indexing.
Checking the version of Gremlin as you did was the right path to take. There may be minor differences between "z" versions of x.y.z and larger differences between "y" versions. So for 3.2.3 you would want this documentation for TinkerPop:
http://tinkerpop.apache.org/docs/3.2.3/reference/
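To make the version matching concrete, here is a minimal Python sketch that builds the reference documentation URL for a given TinkerPop version and reports the most significant component in which two versions differ. The helper names (parse_version, docs_url, version_gap) are hypothetical, and the URL pattern is simply assumed to follow the 3.2.3 link above for other releases.

```python
def parse_version(version):
    """Split an 'x.y.z' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def docs_url(version):
    """Build the reference docs URL, assuming the pattern of the 3.2.3 link."""
    return "http://tinkerpop.apache.org/docs/%s/reference/" % version


def version_gap(installed, documented):
    """Name the most significant component in which two versions differ."""
    a, b = parse_version(installed), parse_version(documented)
    for label, left, right in zip(("x", "y", "z"), a, b):
        if left != right:
            return label
    return "none"


print(docs_url("3.2.3"))
# A "y" gap suggests larger differences than a "z" gap.
print(version_gap("3.2.3", "3.3.0"))
```

If the gap between your installed version and the documentation you are reading is "y" or "x", that is a likely reason for 'no signature of method:' errors.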
As of this writing, JanusGraph has not yet released a version with TinkerPop 3.3.0 support, and my sense is that it is not quite as trivial as just bumping the version number. 3.3.0 introduced a number of changes that graph providers would likely have to deal with in the form of new tests, revised semantics, class renaming, etc. It's not something you would likely be able to do on your own without prior knowledge of how JanusGraph works.
There does appear to be a pull request for 3.3.0 support, however, so you could try to build that if you'd like an early look at how it works. If not, I suggest you consult the 3.2.3 documentation and simply write your Gremlin in that form. 3.3.0 doesn't really introduce a ton of major new Gremlin steps, so you aren't missing much - I think you only get limit() and better addE() semantics. I would be sure to consult the javadocs of 3.2.6 for a full list of every Gremlin step that is deprecated, so that when JanusGraph does release 3.3.0 support, you are in the best position to upgrade.
I have started to use Google Cloud Datalab. While I understand it is a Beta product, I find the docs very frustrating, to say the least.
The questions here and the lack of responses, as well as the lack of new revisions or docs over the several months the project has been available, make me wonder if there is any commitment to the product.
A good start would be a notebook that shows data ingestion from external sources into both the Datastore system and the BigQuery system. That is a common use case. I'd like to use my own data, and it would be great to have a notebook to ingest it. It seems that should be doable without huge effort? And it would get me (and others) out of this mess of trying to link the various terse docs from various products and workspaces and get them working together.
That is in addition to a better explanation of the GitHub connection process (see my prior question).
For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb
For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb
Those are the only two storage options currently supported in Datalab (which should not be used in any event for large scale data transfers; these are for small scale transfers that can fit in memory in the Datalab VM).
For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with GitHub, however.
As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.
Datalab isn't running on your local machine. Just the presentation part is in your browser. So if you mean the browser client machine, that wouldn't be a good solution - you'd be moving data from the local machine to a VM which is running the Datalab Python code (and this VM has limited storage space), and then moving it again to the real destination. Instead, you should use the cloud console or (preferably) gcloud command line on your local machine for this.
I am working on Adobe CQ. I created 2-3 versions (1.2, 1.2, 1.3) of a particular page in my author instance. I then packaged my content page and installed it in another instance, but I couldn't see the versions of the page there.
Can anyone help me with this? I want to migrate my content pages along with their versions from one CQ instance to another.
We are in the same situation. You can extract prior version details using the packaging approach, but you will be precluded from reloading them due to the new Oak security model. The next issue is that you would need to extract and transform the data and then reinsert it, because the node IDs may differ, especially if you are extracting partial data sets.
Where we have gotten to, and are proving now, is to use the new migration tool to move content from instance to instance, which purportedly has a version extract tool. I will update details here when we get our results back.
UPDATE:
We have tested the CRX2OAK migration tool, and it indeed does move versions across. Using the tool, you can specify filters to only migrate a subset of content, which will then drag the version details across as well.
It seems this approach works quite well for both single-tenancy and multi-tenancy setups, just as using a package for content used to.
Unfortunately, it can't be used as a portable backup system, as it is an instance to instance solution. It does, however, work well for blue/green deployment strategies.
Versions are stored under the path '/jcr:system/jcr:versionStorage' in AEM.
To transfer pages with their versions, create a package with filters for the content you want to move and for the version storage path as well, then download the package and install it in the other AEM instance.
If anyone comes across this question as I did, here is the summarised answer:
You can use the crx2oak utility, available from the link below, to migrate pages and page versions across instances:
https://repo.adobe.com/nexus/content/groups/public/com/adobe/granite/crx2oak/
This is a powerful utility with multiple uses (especially in upgrades) as documented in links below:
https://docs.adobe.com/docs/en/aem/6-2/deploy/upgrade/using-crx2oak.html
https://jackrabbit.apache.org/oak/docs/migration.html
The source and destination repositories need to be offline while running this utility, so it is best to plan ahead for this type of migration.
HTH
Is there a really good free tool for BugZilla reporting? I am finding the default search options on the web interface far too limiting. My biggest issue is with the lack of Order By options (only 1 field at a time, and a very limited set of fields to choose from). I have done some Google searches, but I can't find any good free BugZilla reporting tools.
If there isn't one, can someone please point me to an example on how to access the BugZilla web services? If I can get the BugZilla data, then I can easily build my own reports that will better meet our needs.
Take a look at this: http://www.faqs.org/docs/bugzilla/dbdoc.html
Use this database schema for reference: faqs.org/docs/bugzilla/dbschema.html
If you need a web-interface, use your favorite dynamic website scripting language that can access MySQL databases (say PHP)...
Simple-ish Tutorial: freewebmasterhelp.com/tutorials/phpmysql/4
PHP MySQL API Reference: php.net/manual/en/ref.mysql.php
Then use SQL queries such as:
"SELECT * FROM bugs WHERE bug_status != 'RESOLVED' ORDER BY creation_ts ASC, votes DESC LIMIT 50"
which lists the first 50 unresolved bugs, ordered first by ascending creation time and then by descending number of votes.
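As a quick way to try the multi-column ordering before pointing it at a real installation, here is a minimal Python sketch that runs the same kind of query against an in-memory SQLite table. SQLite is only a stand-in for the MySQL database here, and the toy table holds just the columns the query touches, with made-up rows; a real Bugzilla schema has many more columns.

```python
import sqlite3

# Toy stand-in for the Bugzilla "bugs" table; only the columns the query uses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bugs (bug_id INTEGER PRIMARY KEY, bug_status TEXT, "
    "creation_ts TEXT, votes INTEGER)"
)
conn.executemany(
    "INSERT INTO bugs VALUES (?, ?, ?, ?)",
    [
        (1, "NEW", "2008-01-02 10:00:00", 3),
        (2, "RESOLVED", "2008-01-01 09:00:00", 9),
        (3, "ASSIGNED", "2008-01-02 10:00:00", 7),
        (4, "NEW", "2008-01-01 08:00:00", 0),
    ],
)

# Unresolved bugs, oldest first; ties on creation time broken by most votes.
QUERY = (
    "SELECT bug_id FROM bugs WHERE bug_status != 'RESOLVED' "
    "ORDER BY creation_ts ASC, votes DESC LIMIT 50"
)
ordered_ids = [row[0] for row in conn.execute(QUERY)]
print(ordered_ids)
```

Adding more fields to the ORDER BY clause is what gets around the one-field limit of the web interface.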
I have used this in the past and have liked it a lot: http://www.mediawiki.org/wiki/Extension:Bugzilla_Reports
You can also consider other tools, e.g. Mantis
(http://www.mantisbt.org/)
I've personally switched from Bugzilla to Mantis, installed some plugins (http://deboutv.free.fr/mantis/), and found it more comfortable.
If you are a Java user, you might want to check out Mylyn for Eclipse. It integrates a task-driven development approach into Eclipse.
With that, you can raise bugs, tie together SVN changes and bugs, and hide classes that are not relevant to fixing bugs, etc. It's a bit involved to get started with, but quite powerful.
It also comes with a connector for BugZilla. See this introductory article for an example.
If you don't use Eclipse, but you do use Java, then note that since Mylyn is open-source, you might want to look at the source code of the Mylyn BugZilla connector to see how it does its work.
Good luck.
You can try Deskzilla (http://deskzilla.com/) - it is a multi-platform desktop client for Bugzilla with Outlook-like interface, rich reporting and filtering capabilities, offline work, drag-n-drop, etc. It's a commercial product, but if you're working on an Open Source project you can use it for free.
AFAIK Bugzilla uses a MySQL database for storing data. So you can probably connect with some visual DB manager (plenty of them exist, see Toad Data Modeler, DbVisualizer) and try to do some SQL work...
There is a list of add-ons (free and commercial) on the Bugzilla addons wiki.
If you are a Windows user, MyZilla is a possible option.
Otherwise, to work toward your own, see the Bugzilla API documentation, which, in a way, includes how to retrieve the current schema (Bugzilla::DB::Schema), and Bugzilla::WebService.
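As a rough illustration of the web-service route, here is a minimal Python sketch that calls Bug.search over XML-RPC and then sorts the results locally on more than one field. It assumes the installation exposes xmlrpc.cgi and runs a Bugzilla version whose Bugzilla::WebService provides Bug.search, and that returned bugs carry "id" and "status" fields; the base URL, product name, and helper names are placeholders. Only the local sorting is exercised below, on made-up sample data.

```python
import xmlrpc.client


def fetch_bugs(base_url, product):
    """Call Bug.search over XML-RPC; base_url and product are placeholders."""
    proxy = xmlrpc.client.ServerProxy(base_url.rstrip("/") + "/xmlrpc.cgi")
    result = proxy.Bug.search({"product": product})
    return result["bugs"]


def sort_bugs(bugs):
    """Order by status ascending, then by id descending."""
    by_id_desc = sorted(bugs, key=lambda bug: bug["id"], reverse=True)
    # sorted() is stable, so the id order survives within each status.
    return sorted(by_id_desc, key=lambda bug: bug["status"])


# Made-up sample in the assumed shape of the returned bug records.
sample = [
    {"id": 10, "status": "NEW"},
    {"id": 12, "status": "ASSIGNED"},
    {"id": 11, "status": "NEW"},
]
print([bug["id"] for bug in sort_bugs(sample)])
```

With a reachable server, something like sort_bugs(fetch_bugs("https://bugzilla.example.com", "MyProduct")) would combine the two steps; check the API documentation for your version before relying on the field names.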
Netbeans also has Bugzilla integration (I haven't tried it...).
I have analyzed a bunch of bug tracking tools.
You can try Trac or Mantis, because Bugzilla is very unfriendly when it comes to reporting.
Mantis
Mantis can export data to Excel: all the charts you need can be generated from that sheet.
For more information, take a look at my blog:
http://gioorgi.com/2008/bug-tracking-mantis/
Anyway, Trac is used a lot more, so for the sake of completeness I should mention it:
Trac
Pros:
Can also work with an embedded database (using SQLite).
Easy to setup and use.
Cons:
It has too many features, and to some extent it also aims to be a CMS.
Take a look at:
http://gioorgi.com/2008/bug-tracking-trac/
Since Bugzilla can be installed on your own server, I presume the simplest way is to do that and play with the databases it creates ("Bugzilla supports MySQL, PostgreSQL and Oracle as database servers"). The documentation also says you can modify the templates as you like.
Otherwise one could try paid support or some other bug trackers.
I use this bookmarklet and like how it searches right with the strings entered in the location bar like smart search. It lets you quickly search bugzilla or jump to a bug number via Bugzilla Quicksearch, and is IE6+, Moz, Op7+ compatible.
Its companions on the same page can be used to refine or help with bug search/report, e.g. collect buglinks (queries Bugzilla to show a list of bugs linked to from the current page), or buglinkify (turns all numbers on the page into bug links).