Hadoop dfs crashes when I start the Datanode and Namenode - batch-file

I just downloaded, installed, and configured Hadoop version 3.3.4 but I think I configured something wrong.
The issue:
I ran start-yarn.cmd to get the DataNode and NameNode to start. The command opened two more cmd windows, and both displayed a message about the script being deprecated. I've attached the two responses from the nodes as txt files in a Google Drive folder shared at the bottom of this post. The last part shows how the node suddenly shut down:
2023-02-13 20:42:57,817 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
2023-02-13 20:42:57,833 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at Poseidon/192.168.56.1
************************************************************/
Could there be something wrong with my configuration of the four -site.xml files (core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml), or maybe the Hadoop environment cmd file (hadoop-env.cmd), all located in the D:\Hadoop\etc\hadoop directory? I've attached these as well in the Google Drive folder.
What I tried is listed below:
I created some data in a file called Customers.csv in the directory D:\Users\Jacob Glik\IdeaProjects\Test and wanted to upload the file to my Hadoop cluster. Hadoop is in the directory D:\Hadoop\bin.
I navigated to the directory D:\Users\Jacob Glik\IdeaProjects\Test and entered the command hadoop fs -put Customers.csv hdfs://localhost:9000/user/Project1/data into cmd.
cmd's response:
put: Call From Poseidon/192.168.56.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I think this either means I have a firewall problem (unlikely, as Hadoop is on my machine) or the nodes aren't running. I ran jps to check whether the nodes were running and got the following response:
7584
22844 Launcher
25980 Jps
I think this means the nodes aren't running. I then ran start-yarn.cmd to get the DataNode and NameNode to start, and this is where things crashed: as described above, two more cmd windows opened, both warning that the script is deprecated. I've attached the whole message from both nodes in txt format to this post.
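For comparison, when the HDFS and YARN daemons are actually up, jps normally lists them by name, roughly like the sketch below (the process IDs here are placeholders, not real values):
12345 NameNode
12346 DataNode
12347 ResourceManager
12348 NodeManager
12349 Jps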
Also, when I set up Hadoop on my Windows 10 computer I used this YouTube tutorial: "https://www.youtube.com/watch?v=GfixwKmS8Ro&ab_channel=ProgrammingEpitome", in which they told me to delete the bin directory that came with the installation of Hadoop and use their own version. Perhaps my bin is also messed up. I'm out of ideas, so any help and insight is much appreciated!
I will link to my Google Drive folder where I've uploaded the two reports from the nodes, the four xml files, and the env cmd script. Thank you in advance! https://drive.google.com/drive/folders/1fb22xZOYaZa50boLIIAKJjRC-lCTxFmf?usp=sharing
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
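For what it's worth, fs.default.name is the old, deprecated name of this property; current Hadoop releases document it as fs.defaultFS, so an equivalent entry keeping the same value would be:
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>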
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///D:/Hadoop/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/D:/Hadoop/data/datanode</value>
  </property>
</configuration>
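One thing that stands out: dfs.namenode.name.dir is given as a file:/// URI while dfs.datanode.data.dir is not, and the DiskChecker error above reports exactly one configured volume failing. A sketch of the data-dir property using the same URI form (assuming D:/Hadoop/data/datanode really is the intended folder) would be:
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///D:/Hadoop/data/datanode</value>
  </property>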
In the hadoop-env.cmd file I changed the Java location and the user (my username has a space):
set JAVA_HOME=C:\Java
...
set HADOOP_IDENT_STRING="Jacob Glik"
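Note that cmd's set keeps the quotation marks as part of the value, so HADOOP_IDENT_STRING ends up containing literal quotes as well as a space. A sketch of a space-free alternative, assuming any plain identifier is acceptable for HADOOP_IDENT_STRING, would be:
set HADOOP_IDENT_STRING=JacobGlik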

Related

Error creating core in Solr - Solr fails to start

I encountered a problem that I could not figure out for the past week.
After installing another Java JDK (Eclipse JDK) and changing the JAVA_HOME value in the computer environment, my Solr service failed to start.
I tried everything, including formatting my PC and doing a fresh installation the way we do on every new PC, and I still encounter the same problem.
This is the command the system runs to start the Solr service:
Starting java -Xms3000m -Xmx3000m -verbose:gc -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:-OmitStackTraceInFastThrow -DSTOP.PORT=7984 -DSTOP.KEY=mysecret "-Dsolr.install.dir=C:\Program Files\Morphisec Server Dev\solr-6.3.0-ssl\server.." -Djetty.host=0.0.0.0 -Djetty.port=8984 -Dsolr.jetty.https.port=8984 "-Djetty.home=C:\Program Files\Morphisec Server Dev\solr-6.3.0-ssl\server\solr" -Dsolr.autoSoftCommit.maxTime=10 "-Dsolr.log.dir=C:\Program Files\Morphisec Server Dev\solr-6.3.0-ssl\server\logs" -Dsolr.autoCommit.maxTime=60000 -Dsolr.ssl.checkPeerName=false "-Djavax.net.ssl.trustStore=C:\Program Files\Morphisec Server Dev\solr-6.3.0-ssl\server\etc\keystore.solr.jks" -Djavax.net.ssl.trustStorePassword=DEA7B39145F6478C -DzkClientTimeout=15000 -DzkRun -jar start.jar --module=https -DurlScheme=https
None of the schemas in the Solr folders contain any use of intPointField as mentioned, only TrieField…
We use Solr 6.3.0.
When I go into the Solr UI, under Cloud → Tree → /configs, and choose one of my collections, there is a managed-schema file that does not appear in the directory on my PC, and it does use intPointField:
I don't know where it gets it from. (As I mentioned, I even formatted the PC.)
This is the log I get for each failed collection creation:

Error in Google App Engine - Log Service - SQLite

I am using Google App Engine on Ubuntu within the Windows Subsystem for Linux.
When I start dev_appserver.py I receive errors ending with the following lines, which I understand to indicate a corrupted SQLite data file:
File "/../google-cloud-sdk/platform/google_appengine/google/appengine/api/logservice/logservice_stub.py", line 181, in start_request
host, start_time, method, resource, http_version, module))
DatabaseError: database disk image is malformed
Based on this post, I understand there is a log.db being referenced:
GoogleAppEngineLauncher: database disk image is malformed
However, when I run the script referenced, the resulting path does not contain a log.db, leading me to believe this is a different issue.
Any help in identifying the appropriate database, for the purpose of removing it, would be appreciated.
Per a comment, I added --clear_datastore=1 and did not notice a change:
dev_appserver.py --host 127.0.0.1 --port 8080 --admin_port 8082 --storage_path=temp/storage --skip_sdk_update_check true --clear_datastore=1 main/app.yaml main/sync.yaml
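One way to narrow down which SQLite file is involved, assuming the databases live under the --storage_path passed above, is simply to list them:
find temp/storage -type f -name "*.db"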

Get catalina.out in the Apache Tomcat logs

Currently, trying to open the file gives this error:
C:\Users\....\....\apache-tomcat-8.0.45\logs>catalina.out
The process cannot access the file because it is being used by another process.
What I have done is put the application in webapps and start Tomcat using the following command:
catalina.bat jpda start
Now I want to see the logs on Windows. On Ubuntu, tail -f catalina.out can be used, but how can I follow the Tomcat logs on Windows?
As stated in the answers here, you can try more catalina.out or type catalina.out.
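If you want the tail -f behaviour (following the file as it grows) rather than dumping it once, PowerShell's Get-Content can do that; the file name here is just the one from the question:
Get-Content .\catalina.out -Wait -Tail 100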

I cannot run the show databases; command in the terminal for Hive

When I write
> show databases;
in Hive, I get the following error:
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Can you please provide a solution for this?
Run this command from the Hive directory:
bin/schematool -initSchema -dbType derby
Also, make sure the services are started by running:
start-all.sh
It could be that the default location /user/hive/warehouse (set in hive-site.xml) has not been properly created or granted permissions (please note this is user, not usr), which may be the culprit if you are doing a manual setup!
1) You may first check hive-site.xml (located at $HIVE_HOME/conf, in my case /usr/local/hive/conf) if you want, but it holds the default setting anyway.
2) Check whether the path exists in Hadoop: hadoop fs -ls /user/hive/warehouse
3) If it does not exist, create the folder with hadoop fs -mkdir /user/hive/warehouse, and take a look at the access rights using hadoop fs -ls ...
4) Use hadoop fs -chmod g+w /user/hive/warehouse to grant the needed rights.
Either the user vs usr mix-up or the setup of the warehouse could be a common cause (a consolidated command sketch follows the hive-site.xml excerpt below).
Reference (from hive-site.xml):
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
Note: you also have to make sure that another Hadoop folder, /tmp, is properly set up in the same way.
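A consolidated sketch of steps 2) to 4) plus the /tmp note, assuming the default /user/hive/warehouse location from hive-site.xml:
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /user/hive/warehouse
hadoop fs -mkdir -p /tmp
hadoop fs -chmod g+w /tmp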

Tomcat throws "deploy upload fail - no space left" while deploying a war file

I am trying to deploy a war file on a Tomcat server via the Tomcat Manager, but each time I get this error:
FAIL - Deploy Upload Failed, Exception: Processing of multipart/form-data request failed. No space left on device
There is plenty of space on the server. I have only the Tomcat server and a MySQL server running on it, nothing else.
Can anyone tell me what is wrong and what is the workaround?
It might be that you have a permission problem here.
You can try
chmod 777 /opt/apache-tomcat-X.X.XX/work/Catalina/localhost/manager
and maybe additional
chmod 777 /opt/apache-tomcat-X.X.XX/webapps/sample
where sample is your app directory, of course.
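It can also be worth confirming that the partition holding Tomcat's work and temp directories really has free space and free inodes, since the multipart upload is buffered to disk before deployment; a quick check, assuming the same install path as above:
df -h /opt/apache-tomcat-X.X.XX /tmp
df -i /opt/apache-tomcat-X.X.XX /tmp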
