I am working on a Dataproc cluster setup to test all of its features. I created a cluster template and have been using the creation command almost every day, but this week it stopped working. The error printed is:
ERROR: (gcloud.dataproc.clusters.create) Operation [projects/cluster_name/regions/us-central1/operations/number] failed: Failed to initialize node cluster_name-m: Component ranger failed to activate post-hdfs See output in: gs://cluster_bucket/google-cloud-dataproc-metainfo/number/cluster_name-m/dataproc-post-hdfs-startup-script_output.
When I check the output file that the message points to, the error I find is:
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: ERROR: Error CREATEing SolrCore 'ranger_audits': Unable to create core [ranger_audits] Caused by: Java heap space
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]:
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + exit_code=1
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + [[ 1 -ne 0 ]]
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + echo 1
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + log_and_fail ranger 'Component ranger failed to activate post-hdfs'
The creation command that I ran is:
gcloud dataproc clusters create cluster_name \
--bucket cluster_bucket \
--region us-central1 \
--subnet subnet_dataproc \
--zone us-central1-c \
--master-machine-type n1-standard-8 \
--master-boot-disk-size 500 \
--num-workers 2 \
--worker-machine-type n1-standard-8 \
--worker-boot-disk-size 1000 \
--image-version 2.0.29-debian10 \
--optional-components ZEPPELIN,RANGER,SOLR \
--autoscaling-policy=autoscale_rule \
--properties="dataproc:ranger.kms.key.uri=projects/gcp-project/locations/global/keyRings/kbs-dataproc-keyring/cryptoKeys/kbs-dataproc-key,dataproc:ranger.admin.password.uri=gs://cluster_bucket/kerberos-root-principal-password.encrypted,hive:hive.metastore.warehouse.dir=gs://cluster_bucket/user/hive/warehouse,dataproc:solr.gcs.path=gs://cluster_bucket/solr2,dataproc:ranger.cloud-sql.instance.connection.name=gcp-project:us-central1:ranger-metadata,dataproc:ranger.cloud-sql.root.password.uri=gs://cluster_bucket/ranger-root-mysql-password.encrypted" \
--kerberos-root-principal-password-uri=gs://cluster_bucket/kerberos-root-principal-password.encrypted \
--kerberos-kms-key=projects/gcp-project/locations/global/keyRings/kbs-dataproc-keyring/cryptoKeys/kbs-dataproc-key \
--project gcp-project \
--enable-component-gateway \
--initialization-actions gs://goog-dataproc-initialization-actions-us-central1/cloud-sql-proxy/cloud-sql-proxy.sh,gs://cluster_bucket/hue.sh,gs://goog-dataproc-initialization-actions-us-central1/livy/livy.sh,gs://goog-dataproc-initialization-actions-us-central1/sqoop/sqoop.sh \
--metadata livy-timeout-session='4h' \
--metadata "hive-metastore-instance=gcp-project:us-central1:hive-metastore" \
--metadata "kms-key-uri=projects/gcp-project/locations/global/keyRings/kbs-dataproc-keyring/cryptoKeys/kbs-dataproc-key" \
--metadata "db-admin-password-uri=gs://cluster_bucket/hive-root-mysql-password.encrypted" \
--metadata "db-hive-password-uri=gs://cluster_bucket/hive-mysql-password.encrypted" \
--scopes=default,sql-admin
I know it's something related to the Ranger/Solr setup, but I don't know how to increase this heap size without writing an alternative initialization script or building a custom machine image. If you have any idea how to solve this, or need more information about my setup, please let me know.
This could be caused by a wrong character in a username in the SSH configuration for the Dataproc project, which makes the hadoop fs -mkdir -p command fail in the HDFS activation step during cluster creation.
You can follow these steps for a solution.
Create the cluster using the command line within GCP.
Add the flag --metadata=block-project-ssh-keys=true when creating the cluster. Even if you already pass --metadata flags for other purposes, still add this one as well.
You can see this in the following example:
gcloud dataproc clusters create cluster_name \
--metadata=block-project-ssh-keys=true \
--bucket cluster_bucket \
--region us-central1 \
--subnet subnet_dataproc \
………..
This is not required as this feature is enabled by default:
--initialization-actions gs://goog-dataproc-initialization-actions-us-central1/cloud-sql-proxy/cloud-sql-proxy.sh,gs://cluster_bucket/hue.sh,gs://goog-dataproc-initialization-actions-us-central1/livy/livy.sh,gs://goog-dataproc-initialization-actions-us-central1/sqoop/sqoop.sh
In Cloud SQL for MySQL, enable the flag log_bin_trust_function_creators (set it to ON). This flag controls whether stored function creators can be trusted not to create stored functions that cause unsafe events to be written to the binary log. You can reset the flag to OFF after cluster creation. Please try this and let me know if you still see issues.
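For reference, one way to set and later clear that flag from the command line; this is only a sketch, assuming the Ranger metadata instance is the ranger-metadata instance referenced in your create command, and keeping in mind that --database-flags overwrites any flags already set on the instance:
# Turn the flag on before creating the Dataproc cluster
gcloud sql instances patch ranger-metadata \
    --database-flags=log_bin_trust_function_creators=on
# After the cluster is created, you can clear it again
# (or re-list whatever other flags you need instead)
gcloud sql instances patch ranger-metadata --clear-database-flags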
Also, please share the error log for more details.
Related
I tried to run my first cluster, as I'm currently learning so that I can hopefully work in cloud engineering.
What I did:
I have 3 cloud servers (Ubuntu 20.04), all in one network.
I've successfully set up my etcd cluster (cluster-health shows all 3 network IPs of the servers: 1 leader, 2 non-leaders).
Now I've installed k3s on my first server:
curl -sfL https://get.k3s.io | sh -s - server \
  --datastore-endpoint="https://10.0.0.2:2380,https://10.0.0.4:2380,https://10.0.0.3:2380"
I've done the same on the 2 other servers; the only difference is that I added the token value, which I checked beforehand in:
cat /var/lib/rancher/k3s/server/token
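Roughly, the command on the other two servers looked like this (a sketch based on the install command above; K3S_TOKEN carries the token value, which I'm omitting here):
# On each of the other two servers
curl -sfL https://get.k3s.io | K3S_TOKEN="<token-from-server-1>" sh -s - server \
  --datastore-endpoint="https://10.0.0.2:2380,https://10.0.0.4:2380,https://10.0.0.3:2380"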
Now everything seems to have worked, but when I run kubectl get nodes, it just shows me one node...
Does anyone have any tips or answers for me?
k3s service file:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    server \
        '--node-external-ip=78.46.241.153' \
        '--node-name=node-1' \
        '--flannel-iface=ens10' \
        '--advertise-address=10.0.0.2' \
        '--node-ip=10.0.0.2' \
        '--datastore-endpoint=https://10.0.0.2:2380,https://10.0.0.4:2380,https://10.0.0.3:2380'
$ /usr/pgsql-12/bin/pg_upgrade \
> -b /usr/pgsql-10/bin/ \
> -B /usr/pgsql-12/bin/ \
> -d /var/lib/pgsql/10/data/ \
> -D /var/lib/pgsql/12/data/ \
> --check
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database user is the install user ok
Checking database connection settings ok
Checking for prepared transactions ok
Checking for system-defined composite types in user tables ok
Checking for reg* data types in user tables ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for tables WITH OIDS ok
Checking for invalid "sql_identifier" user columns ok
Checking for presence of required libraries fatal
Your installation references loadable libraries that are missing from the
new installation. You can add these libraries to the new installation,
or remove the functions using them from the old installation. A list of
problem libraries is in the file:
loadable_libraries.txt
Failure, exiting
[postgres@localhost ~]$ cat loadable_libraries.txt
could not load library "$libdir/ltree": ERROR: could not access file "$libdir/ltree": No such file or directory
Database: ___
Database: ___
could not load library "$libdir/pg_trgm": ERROR: could not access file "$libdir/pg_trgm": No such file or directory
Database: ___
Database: ___
could not load library "$libdir/uuid-ossp": ERROR: could not access file "$libdir/uuid-ossp": No such file or directory
Database: ___
Database: ___
Valid steps for an upgrade from PostgreSQL 10 to 12 would be highly appreciated, as I haven't found any well-reviewed guide that is complete.
I am currently following this link: https://www.postgresql.r2schools.com/how-to-upgrade-from-postgresql-11-to-12/, replacing 11 with 10 in every command.
Thanks in advance.
The missing libraries listed in loadable_libraries.txt (ltree, pg_trgm, uuid-ossp) ship in the contrib package, which isn't installed for the new PostgreSQL 12 cluster yet. You can install it with:
sudo dnf install postgresql12-contrib
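After that, you can re-run the same check with the paths from your transcript and, once it reports no problems, run the upgrade itself. A sketch (run as the postgres user):
# Re-run the compatibility check
/usr/pgsql-12/bin/pg_upgrade \
  -b /usr/pgsql-10/bin/ \
  -B /usr/pgsql-12/bin/ \
  -d /var/lib/pgsql/10/data/ \
  -D /var/lib/pgsql/12/data/ \
  --check
# If the check passes, stop both PostgreSQL services and run the same
# command without --check to perform the actual upgrade.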
I'm trying to take a PostgreSQL backup with pg_dump, but I'm not able to because of the following error.
I have successfully taken backups for other IP addresses whose connection strings don't contain the special character @.
command used and working
sudo /usr/bin/pg_dump --file "/home/myusername/my_first_db.backup" \
--verbose --format=t --blobs -v \
--dbname postgresql://postgres:myfirstpassowrd@112.112.113.114:5432/my_first_db
command used and not working
sudo /usr/bin/pg_dump --file "/home/myuser/xyz_db/DB_BACKUP/db_file.backup" \
--verbose --format=t --blobs -v \
--dbname postgresql://111.222.333.444:5432/prod_live?password=123th@123th4&user=postgres
sudo /usr/bin/pg_dump --file "/home/myuser/xyz_db/DB_BACKUP/db_file.backup" \
--verbose --format=t --blobs -v \
--dbname postgresql://111.222.333.444:5432/prod_live?password=123th%40123th4&user=postgres
Error I'm getting:
[4] 8555
myuser@myuser:~$ pg_dump: [archiver (db)] connection to database "prod_live" failed: FATAL: password authentication failed for user "root"
FATAL: password authentication failed for user "root"
I cannot change the password, because it is production.
As I can see...
The unquoted character & in your command line sends the task to the background, as described, for example, here: Linux: Start Command In Background. So anything after the & character is ignored (or interpreted as a separate command) by the *nix shell.
Solution
Just try to quote the whole string like this:
sudo /usr/bin/pg_dump --file "/home/myuser/xyz_db/DB_BACKUP/db_file.backup" \
--verbose --format=t --blobs -v \
--dbname 'postgresql://111.222.333.444:5432/prod_live?password=123th@123th4&user=postgres'
Explanation
In the output you provided, the line [4] 8555 means "Background job #4 with process ID 8555 was started".
And single quotes around the string allow the shell to interpret it "as-is", without parameter substitution or other special-character interpretation.
PS: Use $'...' syntax to translate special escaped characters like \n \t \uxxxx and others.
Here are several examples:
$ echo abc&defgh
[1] 3426
abc
defgh: command not found
[1]+ Done echo abc
As you can see, the output is similar to the one you provided, in the [x] xxxx part.
$ echo 'abc&defgh'
abc&defgh
In this case the echo command prints exactly what you want.
And last but not least:
$ echo '1: abc&\ndefgh'; echo $'2: abc&\ndefgh'
1: abc&\ndefgh
2: abc&
defgh
Currently I am doing a project and I am stuck; it would be helpful if anyone could assist. I am using a Linux (Ubuntu) system with Zenity as the GUI for the book inventory system I am creating. The problem I am facing is that I do not know how to transfer the data collected via zenity --forms to BookDB.txt.
===================================================================
zenity --forms --title="New book" --text="Add new book" \
--add-entry="Title" \
--add-entry="Author" \
--add-entry="Price" \
--add-entry="Quantity Available" \
--add-entry="Quantity sold"
read title
read author
read price
read QtyA
read QtyS
echo $title:$author:$price:$available:$sold >> BookDB.txt
echo $BookDB "New book title ' $title ' added successfully "
===================================================================
Thanks a lot for your help!
zenity outputs to stdout. With a form, the fields are separated, by default, with a pipe. You'll want to do this:
data=$(
zenity --forms --title="New book" --text="Add new book" \
--add-entry="Title" \
--add-entry="Author" \
--add-entry="Price" \
--add-entry="Quantity Available" \
--add-entry="Quantity sold"
)
case $? in
1) echo "you cancelled"; exit 1 ;;
255) echo "some error occurred"; exit 1 ;; # zenity reports its "unexpected error" (-1) as exit status 255
0) IFS="|" read -r title author price qtyA qtyS <<< "$data" ;;
esac
If you're not comfortable using pipe as the output separator, there's a --separator option. For example, you might want to use the "FS" character: --separator=$'\034', then IFS=$'\034' read -r a b c d e <<<"$data"
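To then get the record into BookDB.txt, which is what the question asks about, a minimal follow-up after the successful read might look like this (a sketch reusing the colon-separated layout and the BookDB.txt file from the question):
# Append the collected fields to BookDB.txt, colon-separated
printf '%s:%s:%s:%s:%s\n' "$title" "$author" "$price" "$qtyA" "$qtyS" >> BookDB.txt
echo "New book title '$title' added successfully"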
I'm finding the documentation for zenity pretty slim, but here's the official manual: https://help.gnome.org/users/zenity/stable/
I'm trying to write a loop in a Makefile that runs a remote ssh command to obtain the PID of a process and kill it.
Like this:
target:
for node in 23 ; do \
echo $$node ; \
ssh user@pc$$node "~/jdk1.6.0_31/bin/jps | grep CassandraDaemon | awk '{print \$$1}'" > $(PID); \
ssh user@pc$$node "kill -9 $(PID); \
done
But I get:
/bin/sh: 3: Syntax error: ";" unexpected
The issue, I think, is how to store the PID that the remote ssh command returns (it works well without the > $(PID)).
> redirects into files, not into variables. $() captures output in a way you can assign to a variable... but it is also make syntax, so you need to escape it. You also need to escape it when you use the variable, so that you don't get the make variable instead (no, you can't store the value in a make variable).
for node in 23 ; do \
echo $$node ; \
PID=$$(ssh user@pc$$node "~/jdk1.6.0_31/bin/jps | grep CassandraDaemon | awk '{print \$$1}'"); \
ssh user@pc$$node "kill -9 $$PID"; \
done
(assuming one of your many, many edits hasn't changed things too much from when I copied and pasted that to fix it...)
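As a design note, a variant that does everything in one remote command, so the PID never has to be shipped back to the local shell; this is only a sketch and assumes GNU (or busybox) xargs on the remote hosts:
# Note: make recipe lines must be indented with a TAB character.
target:
	for node in 23 ; do \
		echo $$node ; \
		ssh user@pc$$node "~/jdk1.6.0_31/bin/jps | grep CassandraDaemon | awk '{print \$$1}' | xargs -r kill -9" ; \
	done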