I need to serve PDF files stored in Google Cloud Storage.
I tried:
from google.appengine.api import blobstore
from google.appengine.api import images
bkey = blobstore.create_gs_key('/gs/bucket/pdfobject')
url = images.get_serving_url(bkey)
This works well on the dev server, but when deployed to GAE it gives this error:
  ... in get_serving_url_hook
    raise _ToImagesError(e, readable_blob_key)
TransformationError
I also tried:
url = images.get_serving_url(blob_key=None, secure_url=True, filename='/gs/bucket/pdfobject')
Same problem: it works on the dev server but raises TransformationError when deployed.
You are treating the PDF file as if it were an image. You cannot use the Images API with a PDF file. There are several ways of storing and serving static files, which you can find at this link [1].
[1] https://cloud.google.com/appengine/docs/standard/python3/serving-static-files
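If the PDFs already live in GCS, one option on the Python 2 runtime is to stream them through a Blobstore download handler instead of the Images API. A minimal sketch, assuming the same '/gs/bucket/pdfobject' path and a webapp2 app:

from google.appengine.api import blobstore
from google.appengine.ext.webapp import blobstore_handlers

class ServePdf(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self):
        # create_gs_key works for any GCS object; send_blob streams it
        # without involving the Images API, so PDFs are fine here
        gs_key = blobstore.create_gs_key('/gs/bucket/pdfobject')
        self.send_blob(gs_key, content_type='application/pdf')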
I have a Scala web application running in GAE. I need to use a Java library (JWI) which requires me to pass the root folder of a Wordnet installation to edu.mit.jwi.Dictionary's constructor.
I thought about putting all the Wordnet files into Google Cloud Storage, but it doesn't have a concept of a folder at all. So, my question: is there any way to do what I want with Google Cloud Storage, or should I use something else?
You were right when you stated that “there is no API in the Google Cloud Java library for folder manipulation”: as of today, there is no folder manipulation in the Java client library. You can check the library here.
You can still use Google Cloud Storage (GCS): even though gsutil only emulates subdirectories, an object prefix behaves like a regular folder and uses the same slash notation.
I am not sure how your application works, but if I am guessing correctly, you could:
Load the JWI library to your Cloud Shell.
Import the library in your Scala application in App Engine flexible. Find an example here on how to call a Java class using Scala.
Deploy the application. Following the previous steps, the image deployed will contain the JWI library you need.
Load the Wordnet semantic dictionary into a bucket and pass the root folder of Wordnet, in this case a GCS prefix, using the Java client library for the Google Cloud Storage API. The dictionary files must be downloaded (using a get/download call) and stored locally while you are using them.
Find here the Java client library documentation for Cloud Storage. You might need more functions than the ones below, which create a bucket, upload a file, and download it.
package com.example.storage;

// Imports the Google Cloud client library
import com.google.cloud.storage.Acl;
import com.google.cloud.storage.Acl.Role;
import com.google.cloud.storage.Acl.User;
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.BucketInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;

public class QuickstartSample {
  public static void main(String... args) throws Exception {
    // Instantiates a client
    Storage storage = StorageOptions.getDefaultInstance().getService();

    // The name for the new bucket, taken from the command line
    String bucketName = args[0]; // "my-new-bucket";

    // Creates the new bucket
    Bucket bucket = storage.create(BucketInfo.of(bucketName));
    System.out.printf("Bucket %s created.%n", bucket.getName());

    // [START uploadFile]
    // Object name
    String fileName = "filename.ext";

    // Create the object inside the bucket; modify the access list so
    // that anyone with the link can read it
    BlobInfo blobInfo =
        storage.create(
            BlobInfo
                .newBuilder(bucketName, fileName)
                .setAcl(new ArrayList<>(Arrays.asList(Acl.of(User.ofAllUsers(), Role.READER))))
                .build()
            // other options as required
        );

    // The public download link
    System.out.println(blobInfo.getMediaLink());
    // [END uploadFile]

    // Fetch an object from a bucket and store it locally, which is what
    // the JWI dictionary needs before it can be opened
    String blobName = "filename.ext";
    BlobId blobId = BlobId.of(bucketName, blobName);
    Blob blob = storage.get(blobId);
    blob.downloadTo(Paths.get("/tmp/" + blobName));
  }
}
Finally, here is how to compile the code and run it:
mvn clean package -DskipTests
mvn exec:java -Dexec.mainClass=com.example.storage.QuickstartSample -Dexec.args="bucketName"
Could someone help me access BigQuery from an App Engine application?
I have completed the following steps:
Created an App Engine project.
Installed google-api-client, oauth2client dependencies (etc.) into /lib.
Enabled the BigQuery API for the App Engine project via the cloud console.
Created some 'Application Default Credentials' (a 'Service Account Key') [JSON] and saved it/them to the root of the App Engine application.
Created a 'BigQuery service resource' as per the following:
def get_bigquery_service():
    from googleapiclient.discovery import build
    from oauth2client.client import GoogleCredentials
    credentials = GoogleCredentials.get_application_default()
    bigquery_service = build('bigquery', 'v2', credentials=credentials)
    return bigquery_service
Verified that the resource exists:
<googleapiclient.discovery.Resource object at 0x7fe758496090>
Tried to query the resource with the following (ProjectId is the short name of the App Engine application):
bigquery = get_bigquery_service()
bigquery.tables().list(projectId=#{ProjectId},
                       datasetId=#{DatasetId}).execute()
This returns the following:
<HttpError 401 when requesting https://www.googleapis.com/bigquery/v2/projects/#{ProjectId}/datasets/#{DatasetId}/tables?alt=json returned "Invalid Credentials">
Any ideas as to which steps I might have wrong or be missing here? The whole auth process seems a nightmare, quite at odds with the App Engine/PaaS ease-of-use ethos :-(
Thank you.
OK, so despite being a Google Cloud fan in general, this is definitely the worst thing I have been unfortunate enough to have to work on in a while. Poor/inconsistent/nonexistent documentation, complexity, bugs, etc. Avoid it if you can!
1) Ensure your App Engine 'Default Service Account' exists
https://console.cloud.google.com/apis/dashboard?project=XXX&duration=PTH1
You get the option to create the Default Service Account only if it doesn't already exist. If you've deleted it by accident you will need a new project; you can't recreate it.
How to recover Google App Engine's "default service account"
You should probably create the default set of JSON credentials, but you won't need to include them as part of your project.
You shouldn't need to create any other Service Accounts, for Big Query or otherwise.
2) Install google-api-python-client and apply fix
pip install -t lib google-api-python-client
Assuming this installs oauth2client 3.0.x, then on testing you'll get the following complaint:
File "~/oauth2client/client.py", line 1392, in _get_well_known_file
default_config_dir = os.path.join(os.path.expanduser('~'),
File "/usr/lib/python2.7/posixpath.py", line 268, in expanduser
import pwd
File "~/google_appengine-1.9.40/google/appengine/tools/devappserver2/python/sandbox.py", line 963, in load_module
raise ImportError('No module named %s' % fullname)
ImportError: No module named pwd
which you can fix by changing ~/oauth2client/client.py [line 1392] from:
os.path.expanduser('~')
to:
os.environ.get('HOME')
and adding the following to app.yaml:
env_variables:
HOME: '/tmp'
Ugly but works.
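If you'd rather not edit the vendored library in place, the same effect can likely be achieved from appengine_config.py, which the dev server loads at startup; this is a sketch of that workaround (an assumption on my part, not something verified here):

# appengine_config.py
# Resolve '~' ourselves so the dev sandbox never reaches the code path
# that imports the missing pwd module
import os
import os.path

_home = os.environ.get('HOME', '/tmp')
os.path.expanduser = lambda path: path.replace('~', _home, 1)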
3) Download GCloud SDK and login from console
https://cloud.google.com/sdk/
gcloud auth login
The issue here is that App Engine's dev_appserver.py doesn't include any BigQuery emulation (natch); so when you're interacting with BigQuery tables, it's the production data you're playing with; you need to log in to get access.
Obvious in retrospect, but poorly documented.
4) Enable the BigQuery API in the App Engine console; create a BigQuery project ID
https://console.cloud.google.com/apis/dashboard?project=XXX&duration=PTH1
https://bigquery.cloud.google.com/welcome/XXX
5) Test
from oauth2client.client import GoogleCredentials
from googleapiclient.discovery import build

credentials = GoogleCredentials.get_application_default()
bigquery = build('bigquery', 'v2', credentials=credentials)
print bigquery.datasets().list(projectId=#{ProjectId}).execute()
[or similar]
Good luck!
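For completeness, a minimal sketch of running an actual query once the credentials behave; 'my-project' and the SQL are placeholders:

from oauth2client.client import GoogleCredentials
from googleapiclient.discovery import build

credentials = GoogleCredentials.get_application_default()
bigquery = build('bigquery', 'v2', credentials=credentials)

# Synchronous query via the v2 jobs.query endpoint
body = {'query': 'SELECT 17', 'useLegacySql': False}
response = bigquery.jobs().query(projectId='my-project', body=body).execute()
print response['rows']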
I'm new to Google App Engine and Python. I've almost completed a project, but can't get the get_serving_url() function to work. I've stripped everything down to the most basic functionality, following the documentation. And yet I still get a 500 error from the server. Any thoughts? Here is the code:
from google.appengine.api import images
....
class Team(db.Model):
    avatar = db.BlobProperty()
    ....
    def to_dict(self):
        ....
        image_url = images.get_serving_url(self.avatar.key())
The last line is the problem...commenting it out makes the app run fine. But it is copied almost directly from the documentation. I should note that I can download the avatar blob directly with:
class GetTeamAvatar(webapp2.RequestHandler):
    def post(self):
        team_id = self.request.get('team_id')
        team = Team.get_by_id(long(team_id))
        self.response.write(team.avatar)
So I know it is stored correctly. I do not have PIL on my machine... is that the issue? The docs for the Images API say it has PIL locally, so if I'm deploying my app it shouldn't matter, right? I have Python 3.3, and apparently PIL stopped at 2.6.
The Python App Engine runtime is 2.7 (OK, and 2.5), so don't even try to work with 3.x.
Secondly, get_serving_url is a function you call with a Blobstore entity key, not a BlobProperty.
You are confusing two different things here.
I would concentrate on getting your code to run correctly locally under 2.7 first; PIL is available for 2.7.
I'm very impressed if you're trying to deploy your app without even testing it locally.
One thing you'll need to do is make PIL available to your app via the libraries attribute in app.yaml.
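A minimal sketch of what that looks like, assuming you switch the model to a Blobstore-backed property (names here are illustrative). In app.yaml:

libraries:
- name: PIL
  version: latest

And in the model:

from google.appengine.ext import blobstore, db
from google.appengine.api import images

class Team(db.Model):
    # A reference to a Blobstore entity rather than raw bytes; .key()
    # returns the BlobKey that get_serving_url() expects
    avatar = blobstore.BlobReferenceProperty()

    def to_dict(self):
        return {'avatar_url': images.get_serving_url(self.avatar.key())}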
I have used lxml on Google App Engine to scrape some basic data.
It works fine with the SDK. When I try to use it on the App Engine servers I get:
IOError: Error reading file 'http://www.google.com': failed to load external entity "http://www.google.com"
My code looks like:
import lxml.html
url = "http://www.google.com"
t = lxml.html.parse(url)
pagetitle = t.find(".//title").text
self.response.out.write(pagetitle)
edit:
I ended up having to make a small change, handling the fetch as outlined in the answer below.
from google.appengine.api import urlfetch
result = urlfetch.fetch(url)
t = lxml.html.fromstring(result.content)
GAE does not support opening sockets; you should use urlfetch.fetch() to get the page contents, then feed them to the parser.
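Putting that together, a minimal handler sketch along those lines (the handler name is illustrative):

import lxml.html
import webapp2
from google.appengine.api import urlfetch

class TitleHandler(webapp2.RequestHandler):
    def get(self):
        # Fetch over GAE's urlfetch service instead of letting lxml open
        # a socket itself, then parse the returned bytes
        result = urlfetch.fetch('http://www.google.com')
        tree = lxml.html.fromstring(result.content)
        self.response.out.write(tree.find('.//title').text)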