Tesseract.js not working as a node package in my ReactJS project

Tesseract initializes fine until it needs to load the language files, and then it just stops working. See the attached picture for reference on the error.
The npm package installs fine. I also downloaded the offline files (worker and wasm files) and got them working, as I can see that they load correctly, at least until Tesseract starts loading the language files and breaks my app.
The worker and wasm files are placed in the /public folder so they can be read by the JSX. I tried not using the offline files by removing these lines:
workerPath: '/External/tesseractjs_data/js/worker.min.js',
corePath: '/External/tesseractjs_data/js/tesseract-core.wasm.js',
but I still get the same error. Almost all of the solutions I have seen online for this problem are in Java, and one of them requires installing some kind of Tesseract software, which I want to avoid: I picked web programming precisely so that installation would be minimal.

I don't think anyone will need this, but here is how I fixed my issue:
It seems my download manager (IDM) was capturing the language files (traineddata.gz), which left a key with a 0 value in IndexedDB for that domain/browser.
Clear the browser cache, or just delete that key/value pair in IndexedDB, which can be found in the browser's developer tools under "Storage".
Disable the download manager, or just remove ".gz" from the file types it captures.
It should now work.

Related

Appengine runs a stale version of the code -- and stack traces don't match source code

I have a python27 appengine application. My application generates a 500 error early in the code initialization, and I can inspect the stack trace in the StackDriver debugger in the GCP console.
I've since patched the code and re-deployed under the same service name and version name (i.e. gcloud app deploy --version=SAME). Unfortunately, the old error still comes up, and line numbers in the stack traces reflect the files in the buggy deployment. If I use the code viewer to debug the error, however, I am brought to the updated, patched code in the online viewer -- and there is a mismatch. It behaves as if the app instance is holding on to a previous snapshot of the code.
I'm fuzzy on the freshness and eventual consistency guarantees of GAE. Do I have to wait to get everything to serve the latest deployed version? Can I force it to use the newer code right away?
Things I've tried:
I initially assumed the problem had to do with versioning, i.e. maybe requests were being load-balanced between instances with the same version, but each with slightly different code. I'm a bit fuzzy on the actual rules that govern which GAE instance gets chosen for a new request (especially whether GAE tries to reuse previous instances based on a source IP). I'm also fuzzy on whether or not active instances get destroyed right away when different code is redeployed under the same version name.
To take that possibility out of the equation, I tried pushing to a new version name, and then deleting all previous versions (using gcloud app versions list to get the list). But it doesn't help -- I still get stack traces from the old code, despite the source being up to date in the GCP console debugger. Waiting a couple hours doesn't do anything either.
I've tried two things:
(1) disabling and re-enabling the application in GAE->Settings
(2) I'd also noticed that there were some .pyc files uploaded in the snapshot, so I removed those and re-deployed.
I discovered that (1) is a very effective way to stop all running appengine instances. When you deploy a new version of a project, a traffic split is created (i.e. 0% for the old version and 100% for the new), but in my experience old instances might still be running if they've been used recently (despite them being configured to receive 0% of traffic). Toggling kills them all immediately. I unfortunately found that my stale code was still being used after re-enabling.
(2) did the trick. It wasn't obvious that .pyc were being uploaded. I discovered it by looking at GCP->StackDriver->Debug and I saw .pyc files in the tree snapshot.
I had recently updated my .gitignore to ignore locally installed pip runtime dependencies for the project (output of pip install -t lib -r requirements.txt). I don't want those in git, but they do need to ship as part of my appengine project. I had removed the #!include:.gitignore special include line from .gcloudignore. However, I forgot to re-add *.pyc to my .gcloudignore.
Another way to see the complete set of files included in an app deployment is to increase the verbosity to info on the gcloud app deploy command -- you see a giant json manifest with checksums. I don't typically leave that on because it's hard to visually inspect, but I would have spotted the .pyc in there.
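As a quick local check before deploying, the search for stray compiled files can be sketched in a few lines of Python. This is only a sketch: the function name find_compiled_files is hypothetical, and it simply walks the directory tree, so it does not know which files .gcloudignore would already exclude.

```python
import os


def find_compiled_files(root, suffixes=(".pyc", ".pyo")):
    """Return sorted paths (relative to root) of compiled Python files.

    Hypothetical helper: it only walks the tree on disk and does not
    interpret .gcloudignore rules.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                full_path = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full_path, root))
    return sorted(matches)


if __name__ == "__main__":
    # List every compiled file under the current project directory.
    for path in find_compiled_files("."):
        print(path)
```

If this prints anything from a directory that gets deployed, those files are candidates for a *.pyc entry in .gcloudignore.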

Security of external jar file in GWT

I have created a GWT project which is successfully using an external jar file (see GWT - Using external jars / Java Projects by Lars Vogel and Adding external jar in GWT).
When I use a library file like this, what happens when I compile the project and upload it to AppEngine? Does the jar file get uploaded as it is, or does it get compiled into something else first? And if the former, is it at any security risk of being downloaded without my control?
Let's drop the "google-app-engine" part, it doesn't matter here. You use the library in GWT, on client side. App Engine is server side, with no direct connection to GWT (but due to the volume restrictions it is quite useful to utilize some client side execution like GWT).
Everything you use in GWT will be compiled to JavaScript, transferred to the client and executed there. Obviously you have no control over the result and what the client does with it.
But it will be next to unreadable. Plus the client does not get the JAR per se and he does not get everything that is inside the JAR.
So what really matters is if the library's license allows this and if there are secrets in the library code that are only intended to be used on server side.
Actually, the previous answer is not quite correct. The "google-app-engine" part matters a lot here. Technically, GWT compiles and obfuscates all of the Java code it needs, and it strips out everything that it doesn't need. So, from the JavaScript generated by GWT, it should indeed be nearly impossible to reconstruct or maybe even recognize the library. But it turns out that if you use the Eclipse plugin to deploy your app, appcfg uploads all sorts of random stuff to the AppEngine servers, sometimes including the entire Java source of the project (client side code included).
To see what exactly it uploads when you do a deploy, check in your system's temp-directory while the upload is running. You will find an AppEngine staging directory there that contains everything to be sent.
For suggestions for ways around this, you can refer to the answers to a question that I asked earlier: Removing unwanted uploads from AppEngine deployment
What I haven't checked is whether all the unwanted uploaded files end up in directories that are actually directly accessible from the internet.

Setting up an Ant script to upload files

I've come across only a few examples of how to do this and they didn't work for me, mainly because I've only used an Ant script to auto-build jar files through Jenkins. Now, though, I need to build those files in Jenkins and then upload them to a 3rd party file site like SourceForge. This is both to save hard drive space on the server, since I don't own it, and to allow external downloads. Any help is welcome, but no comments on the fact that I don't know too much about Ant scripts.
Also, something related but a bit separate: the jar file I'm building depends on another jar file with its own version. I also want to create a new folder each time it uploads with a different dependency version. This way the users who download this file can easily understand which main jar version it goes with, while allowing me to upload 20+ sub-builds.
There are several ways to upload files, so there are several kind of ant tasks to do the job.
For instance, if you want to upload to sourceforge, you can use the Ant task scp. But it seems also possible to upload there via FTP: so here is the task ftp.
Maybe you will find some other service which requires you to upload via HTTP: ant-contrib has the post task.
I used to do publication as part of my Ant build logic, creating a special "publish" target that issued the scp or ftp command. Now I'm more inclined to leverage one of the "publish over" plugins for Jenkins.
The main reason for this shift is the management of access credentials. Using the ANT based approach I was forced to run my build on a Jenkins slave that was pre-configured with the correct SSH key to talk to the remote server. The Jenkins plugin manages private keys centrally and ensures all slaves are properly configured.
Finally if your build has dependencies on 3rd party jars, use a dependency manager like ivy to download them and include them in your project. It then becomes trivial to include their upload as part of your publish step.

My GAE python development datastore is never persisted to a file

I have just started using GAE (Python 2.7, SDK 1.6.4). I have set up a
simple test project using Pydev (latest version) in eclipse (indigo)
on Windows XP (SP3).
It all works fine, my app can record data in the datastore and the blobstore
and then retrieve it, but when I stop the development server and start
it again the data in the datastore is lost. This is not the case for
the blobstore which is retaining blobs fine and I can see the
blobstore folder that gets created in C:\Temp
I did the sensible thing and looked back through old posts and found
that most people who have this problem solve it by changing the
location of the datastore file, so I used the following parameters;
--datastore_path="${workspace_loc}/myproject/datastore"
--blobstore_path="${workspace_loc}/myproject/blobstore"
"${workspace_loc}/myproject/src"
I moved the blobstore at the same time as you can see.
The blobstore still works, and now the blobstore folder is created in
myproject folder as expected. The datastore file is still not created
however, and when I stop and restart the development server the data
is still lost.
The dev server startup logs include the following entry
WARNING 2012-04-20 10:49:04,513 datastore_file_stub.py:513] Could not
read datastore data from C:\myworkspace\myproject\datastore
So I know it is trying to create the datastore in the correct place.
Finally I lifted the whole eclipse workspace folder and copied it to
another computer with exactly the same setup except it is running
Windows 7 instead of Windows XP.
Everything works fine there - both the datastore file and blobstore
folder are now created where I expect them to be.
I have set up eclipse, python, gae, my project and my eclipse launch
file in exactly the same way on two computers, it works on one and
not the other. Maybe XP has something to do with it but to be honest I
think that's unlikely.
The only other clue I have come up with is that a recent change to the
GAE development server stopped writing to the datastore file after
every change and only flushes on exit. This problem may be closely related to mine:
App Engine local datastore content does not persist
However adding the following to my code did not help at all.
from google.appengine.tools import dev_appserver
import atexit
atexit.register(dev_appserver.TearDownStubs)
So it's not down to an incorrect termination sequence either, as far as I
can tell, although it may be that I just added it in the wrong place (I am new to Python).
Anyway I am stumped and I would be really grateful for suggestions you
guys can come up with.
It's probably http://code.google.com/p/googleappengine/issues/detail?id=7244 and a bug. Hopefully a fix will be available soon.
Did you try:
--storage_path=...
Path at which all local files (such as the Datastore, Blobstore files, Google Cloud Storage Files, logs, etc) will be stored, unless overridden by --datastore_path, --blobstore_path, --logs_path, etc.
Found at https://developers.google.com/appengine/docs/python/tools/devserver?csw=1
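To try that suggestion, the launch arguments could be assembled along these lines. This is a minimal sketch, assuming an SDK version whose dev_appserver.py accepts the --storage_path flag quoted above; the helper name build_devserver_args and the example paths are hypothetical. It also creates the storage directory up front and checks that it is writable, since a directory the server cannot write to is one possible reason for a datastore file never appearing.

```python
import os


def build_devserver_args(app_dir, storage_dir):
    """Build a dev_appserver.py argument list using a single storage path.

    Hypothetical helper: assumes the installed SDK supports --storage_path.
    """
    storage_dir = os.path.abspath(storage_dir)
    if not os.path.isdir(storage_dir):
        os.makedirs(storage_dir)  # create the folder before the server starts
    if not os.access(storage_dir, os.W_OK):
        raise IOError("storage directory is not writable: " + storage_dir)
    return ["dev_appserver.py", "--storage_path=" + storage_dir, app_dir]


if __name__ == "__main__":
    # Example paths are placeholders; substitute your own project layout.
    print(" ".join(build_devserver_args("src", "local_storage")))
```

The printed command line can then be pasted into the Eclipse launch configuration or run from a shell.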

File sharing not working on iPad

I am trying to get file sharing to work on my app in iPad. I've added UIFileSharingEnabled to the plist, I've queried the documents dir and written a file there (and verified it is there by looking at it through Organizer). iTunes just refuses to display the file sharing info in the device's apps tab. Does anyone know of any obscure step I might be missing, or some edge case that might be causing my problem? I've tried all the rebooting, clean building avenues.
Try running a sync of your iPad to iTunes. That did it for me.
Here's what fixed this for me: I built an ad-hoc distribution and installed it on my iPad. I believe that once the app was installed and recognized as an actual "app" in iTunes (versus something that shows up as a function of building through Xcode), iTunes was able to recognize it as being able to share documents.

Resources