MLAB / Heroku - mongorestore undefined - database

According to the mLab & MongoDB documentation, inserting an existing local BSON collection into a new cloud collection should be fairly straightforward.
Import collection
mongorestore -h <host_url> -d <db_name> -u <user> -p <password> <input .bson file>
https://www.mlab.com/databases/your_db_name#tools
I've successfully connected to the mongo shell, but:
a) I'm unable to verify the OS in use per https://docs.mongodb.com/guides/server/import/#check-your-environment: ls returns [native code]
b) Running the mongorestore command produces a string of "missing ; before statement" SyntaxErrors (which I am able to fix, despite the strangeness)
c) Running the command then returns mongorestore is not defined
Depressingly, I'll have to spin up an endpoint in the cloud and POST all my docs from local to cloud via that method.
However, does anyone know what the issue here is? Why would mongorestore (and, incidentally, mongoimport) not be available?
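For what it's worth, those exact symptoms (JavaScript SyntaxErrors, then "not defined") usually mean the command was typed inside the mongo shell; mongorestore is a standalone binary that must be run from the operating-system shell. A minimal sketch with placeholder values:
# run from the OS shell, not from inside the mongo shell (host, db, user, and path are placeholders)
mongorestore -h ds012345.mlab.com:12345 -d your_db_name -u dbuser -p dbpassword ./dump/your_db_name/your_collection.bson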

Related

cx_Oracle in Azure Databricks

I am unable to establish a connection to my Oracle database from Azure Databricks, although it works in ADF, where I am able to query the table. But ADF takes time to filter the records, so I am still trying to connect from Databricks.
I followed the steps from this Microsoft link, both manually and using an init script, but the error persists.
When I looked into my cluster event log, it says the init-script execution was successful.
Error message when I tried to establish the connection:
DPI-1047: Cannot locate a 64-bit Oracle Client library: "/databricks/driver/oracle_ctl//lib/libclntsh.so: cannot open shared object file: No such file or directory".
When I executed the following command
dbutils.fs.ls("/databricks/driver/")
there was no such directory
This prompted me to post some questions here:
Does this mean the init-script did not perform its job?
Is /databricks/driver/oracle_ctl a hidden directory for dbutils.fs.ls?
The error message points to /databricks/driver/oracle_ctl//lib/libclntsh.so, but when I manually inspected the downloaded Oracle client, there was no folder called lib, although libclntsh.so exists in the main directory. Is Databricks checking the wrong directory for libclntsh.so?
Does this connection still work for others?
Syntax for the connection: cx_Oracle.connect(user=user_name, password=password, dsn=IP+':'+Port+'/'+DB_name)
The above syntax works fine when connecting from an on-premises machine.
Try installing the latest major release of cx_Oracle, which got renamed to python-oracledb; see the release announcement.
This version doesn't need Oracle Instant Client. The API is the same as cx_Oracle's, although obviously the module name is different.
If I understand the instructions, your init script would do something like:
/databricks/python/bin/pip install oracledb
Application code would look like:
import oracledb

# placeholder credentials and DSN; replace with your own
connection = oracledb.connect(user='scott', password=mypw, dsn='yourdbhostname/yourdbservicename')
with connection.cursor() as cursor:
    for row in cursor.execute('select city from locations'):
        print(row)
Resources:
Home page: oracle.github.io/python-oracledb/
Quick start: the python-oracledb installation quick start
Documentation: python-oracledb.readthedocs.io/en/latest/index.html
PyPI: pypi.org/project/oracledb/
Source: github.com/oracle/python-oracledb
Upgrading: Upgrading from cx_Oracle 8.3 to python-oracledb
Changing the path from "/databricks/driver/oracle_ctl/" to "/databricks/driver/oracle_ctl/instantclient" in the init script makes that error go away.
Please use the following init script instead:
dbutils.fs.put("dbfs:/databricks/<init-script-folder-name>/oracle_ctl.sh","""
#!/bin/bash
sudo apt-get install libaio1
wget --quiet -O /tmp/instantclient-basiclite-linuxx64.zip https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip
unzip /tmp/instantclient-basiclite-linuxx64.zip -d /databricks/driver/oracle_ctl/
mv /databricks/driver/oracle_ctl/instantclient* /databricks/driver/oracle_ctl/instantclient
sudo echo 'export LD_LIBRARY_PATH="/databricks/driver/oracle_ctl/instantclient/"' >> /databricks/spark/conf/spark-env.sh
sudo echo 'export ORACLE_HOME="/databricks/driver/oracle_ctl/instantclient/"' >> /databricks/spark/conf/spark-env.sh
""", True)
Notes:
The above init-script was advised by a databricks employee and can be found here.
As mentioned by Christopher Jones in one of the comments, cx_Oracle has recently been upgraded to python-oracledb, which comes in a thin and a thick mode (a short sketch follows below).
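As a side note, a minimal sketch of what the two modes look like in application code (credentials and DSN are placeholders, and the lib_dir value is an assumption based on the init script above):
import oracledb

# Thick mode: load the Instant Client unpacked by the init script above.
# Omit this call entirely to stay in the default thin mode, which needs
# no Instant Client at all; it must be called before any connection is made.
oracledb.init_oracle_client(lib_dir="/databricks/driver/oracle_ctl/instantclient")
connection = oracledb.connect(user="user_name", password="password",
                              dsn="dbhost:1521/dbservice")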
You will get the above error if you don't have the Oracle Instant Client on your cluster.
To resolve the above error in Azure Databricks, try the following:
%sh
mkdir -p /opt/oracle
cd /opt/oracle
# Databricks clusters run Linux, so the Linux Instant Client is needed here
wget https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip
unzip instantclient-basiclite-linuxx64.zip
# adjust the directory name to whatever version the zip unpacks to
export ORACLE_HOME=/opt/oracle/instantclient_19_6
export TNS_ADMIN=$ORACLE_HOME
export LD_LIBRARY_PATH=$ORACLE_HOME:$LD_LIBRARY_PATH
To create the init script, use the following code, as per the official doc:
dbutils.fs.put("dbfs:/databricks/<init-script-folder>/oracle_ctl.sh","""
#!/bin/bash
wget --quiet -O /tmp/instantclient-basiclite-linuxx64.zip https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip
unzip /tmp/instantclient-basiclite-linuxx64.zip -d /databricks/driver/oracle_ctl/
sudo echo 'export LD_LIBRARY_PATH="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh
sudo echo 'export ORACLE_HOME="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh
""", True)
To read data from an Oracle database in PySpark, follow this article by Emrah Mete.
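For reference, a minimal hedged sketch of such a JDBC read (host, service name, table, and credentials are placeholders, and the Oracle JDBC driver jar must already be installed on the cluster):
# read an Oracle table into a DataFrame over JDBC (all option values are placeholders)
df = (spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/dbservice")
      .option("dbtable", "locations")
      .option("user", "user_name")
      .option("password", "password")
      .option("driver", "oracle.jdbc.driver.OracleDriver")
      .load())
df.show()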
For more information refer this official document:
https://docs.databricks.com/data/data-sources/oracle.html#oracle

How to duplicate database in ArangoDB when you can't run arangodump

I want to make a clone/duplicate of a database I have in ArangoDB. This https://stackoverflow.com/a/27827457 is one way I saw to do it, but it doesn't work for me because I can't run arangodump or any of the other Arango commands (like arangosh, arangorestore, etc.).
Also, why can't I run arangodump? This answer https://stackoverflow.com/a/63074313 says to "Open terminal and use cd to go to the directory in which arangoimport.exe is stored", but I can't find arangoimport.exe anywhere.
I looked on the ArangoDB website already, but I couldn't find any info.
If you don't have access to arangodump and arangorestore on the server, the easiest way to invoke them is via Docker, pointing at your server with the --server.endpoint option. You'll need to map a volume/directory into the container to preserve the dumped data so it can be restored into the other database. Something like this:
#dump data to /tmp/dump at your host
docker run -it --rm -v /tmp/dump:/dump arangodb/arangodb:3.7.6 arangodump --server.endpoint http+tcp://192.168.1.2:8529
#restore data from /tmp/dump at your host
docker run -it --rm -v /tmp/dump:/dump arangodb/arangodb:3.7.6 arangorestore --server.endpoint http+tcp://192.168.1.2:8529
Documentation of all available options, including examples, is here for arangodump and here for arangorestore.
The other option is to write your own implementation of dump and restore using the ArangoDB REST APIs, but that's a hefty and error-prone task compared to installing Docker and running the provided dump and restore tools.
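If you do go the REST route anyway, a rough Python sketch of the idea (endpoint, credentials, and database names are placeholders; a real tool must also copy indexes and follow cursor pagination, which is exactly why arangodump is preferable):
import requests

BASE = "http://192.168.1.2:8529"   # placeholder endpoint
AUTH = ("root", "password")        # placeholder credentials

# list non-system collections in the source database
cols = requests.get(BASE + "/_db/mydb/_api/collection", auth=AUTH).json()["result"]
for col in (c for c in cols if not c["isSystem"]):
    # create the collection in the target database
    requests.post(BASE + "/_db/mydb_copy/_api/collection",
                  json={"name": col["name"]}, auth=AUTH)
    # fetch the first batch of documents via an AQL cursor
    # (a real implementation must loop with PUT /_api/cursor/<id> while hasMore)
    batch = requests.post(BASE + "/_db/mydb/_api/cursor",
                          json={"query": "FOR d IN @@col RETURN d",
                                "bindVars": {"@col": col["name"]},
                                "batchSize": 1000},
                          auth=AUTH).json()["result"]
    # bulk-insert the batch into the target collection
    requests.post(BASE + "/_db/mydb_copy/_api/document/" + col["name"],
                  json=batch, auth=AUTH)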

AIP backup - using Docker

I am using the cloned dspace 6-x branch, installed via Docker. Can someone help me with backing up my local database (communities, collections, items) to a remote database?
According to the documentation we need to use the command:
dspace packager -s -t AIP -e eperson -p parent-handle file-path
But it returns an error: dspace is not a command
Anyone could help me transfer my local database to my remote repo?
Thanks!
Moving publications to a new repository will be a more substantial undertaking!
But your immediate problem seems to be that you are either not in the right container or not using the right path when executing the dspace command, hence "not a command". Make sure to execute dspace inside the dspace container and specify the right/complete path. The dspace command is located in
/path/to/your/dspace-deployment-directory/bin.
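In a Docker setup that usually means something like the following (the container name, install path, eperson, handle, and file name are all placeholders; adjust them to your docker-compose file):
# run the packager inside the dspace container, using the full path to the binary
docker exec -it dspace /dspace/bin/dspace packager -s -t AIP -e admin@example.com -p 123456789/1 /tmp/site-aip.zip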

DB reset after deploy to meteor servers

I re-deployed my app by using 'meteor deploy' and my database was reset.
Any clue why this happened, or how I can avoid it in the future?
When a Meteor app is deployed, the data saved in your local mongo is not deployed to the server. You can use mongodump and mongorestore to fix this (docs):
First, dump your database somewhere:
mongodump --host localhost:3001
Get your MongoDB credentials by running (in your app dir):
meteor mongo myapp.meteor.com --url
This will give you a database URL of the form:
mongodb://username:password@host:port/databasename
With this info you can fill in mongorestore (docs) and restore your local database dump:
mongorestore -u username -p password -h host:port -d databasename ~/desktop/location_of_your_mongodb_dump
All of your data will be transferred this way. I hope this helps.
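Putting it together, a hedged end-to-end sketch (the local Meteor development database is named "meteor", so the dump lands in ./dump/meteor; the credentials are the placeholders from the URL above):
# dump the local development database (Meteor's local db is called "meteor")
mongodump --host localhost:3001
# restore that dump into the deployed database
mongorestore -u username -p password -h host:port -d databasename ./dump/meteor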

How can I attach a database to an app in Heroku?

I'm using Heroku's Postgres addon, and I created a new production database from the Heroku Postgres addon page.
I didn't add it directly to my app using the Resources page of my app.
Now I want to attach this database to my app so it'll be recognized by the heroku pg command.
I'm able to use the database, by the way, after setting the DATABASE_URL config var of my app to point to it, but the heroku pg command doesn't recognize it yet.
Additional info: the previous database was a Shared one, and the new one is a Production one.
Heroku add-ons may now be attached across applications and multiple times on a single app.
heroku addons:attach ADDON_NAME -a APP_NAME
Source: https://devcenter.heroku.com/changelog-items/646
To know the name of your addon, do:
heroku addons
Source: https://devcenter.heroku.com/articles/managing-add-ons
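Putting the two together, a hedged example with placeholder add-on and app names:
# find the add-on's name, then attach it to the target app
heroku addons
heroku addons:attach postgresql-sinuous-12345 -a my-app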
Did you add the database using the app-independent https://postgres.heroku.com/ site? Or did you just create a postgresql database in your Heroku control panel?
If you created your database on https://postgres.heroku.com/, you will not see the database via your heroku pg:info command. What you can do to add your database to your application, however, would be to:
Log into https://postgres.heroku.com/.
Click on the database you want to attach to your application.
Under 'Connection Settings', click the configuration button at the top right.
Then click the 'URL' option.
Copy your database URL; this should be something like "postgres://blah:blah@ec2-23-23-122-88.compute-1.amazonaws.com:5432/omg".
In your application, on the command line, run heroku config:set DATABASE_URL=postgres://blah:blah@ec2-23-23-122-88.compute-1.amazonaws.com:5432/omg
What we did there was assign your database to the DATABASE_URL environment variable in your application. This is the variable that's used by default when you provision databases locally to your application, so theoretically, assigning this value should work just fine for you.
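To double-check that the variable took effect, a hedged one-liner (the app name is a placeholder):
heroku config:get DATABASE_URL -a my-app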
To get a database that you created at https://postgres.heroku.com/ attached to the Heroku app you are working on, you can't use any of the pg backup commands, and as far as I can tell there is no supported Heroku way of attaching an existing database to an app.
You can, however, create a backup of your database using pg_dump and then use pg_restore to populate the new database that is attached to your app:
pg_dump -i -h hostname -p 5432 -U username -F c -b -v -f "backup-filename" database_name
Once that is complete you can populate your new database with:
pg_restore -i -h new_hostname -p 5432 -U new_username -d new_database_name -v "same_backup_filename"
Even if you are upgrading from the "basic plan" to the "crane plan", you still have to do a backup and restore, but since those databases are already attached to your app, you have the advantage of being able to use the heroku backup commands.
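For completeness, with both databases attached, a hedged sketch using the current CLI (the attachment names, backup id, and app name are placeholders; older toolbelts used heroku pgbackups instead):
# snapshot the old database, then restore that backup into the new one
heroku pg:backups:capture HEROKU_POSTGRESQL_RED -a my-app
heroku pg:backups:restore b001 HEROKU_POSTGRESQL_BLUE -a my-app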
