We use Wizdler to get data, but we have to get data about 1 million persons.
There is an option to search for one or for multiple IDs, and we always write the IDs here one by one.
Our question is: is there any way to input 1 million numbers here at one time?
We also have a list of all the ID numbers. Is it possible to point the request at a link to our file that has all the numbers, or must we write every number like this?
<int xmlns="http://schemas.microsoft.com/Arrays">123456</int>
<int xmlns="http://schemas.microsoft.com/Arrays">654321</int>
I have all the combinations saved in a text file that looks like this:
0000000
0000001
0000002
0000003
etc...
It's not possible to use a link to your file in that SOAP request: the request has to contain all the identifiers you want to retrieve. Also, one million rows will presumably be too much for that web service, so you will need to split them into multiple chunks.
To create a SOAP request with all the numbers, you can use a command to generate all the int elements and then use any text editor to wrap them in a SOAP envelope. This gives you the request you can send to the web service.
In a Linux environment, the command to generate the lines looks like this (identifiers.txt is the name of your file with all the identifiers; a is the namespace alias for http://schemas.microsoft.com/Arrays, which has to be defined beforehand):
awk '$0="<a:int>"$0"</a:int>"' identifiers.txt
The result will look like this:
<a:int>0000000</a:int>
<a:int>0000001</a:int>
<a:int>0000002</a:int>
<a:int>0000003</a:int>
You can also generate the whole request, not only the repetitive part of it. The following example assumes that the operation name is yourRequest and that it is in the namespace yourNamespace. The int elements are also not wrapped in any other element. You will have to alter this to match your scenario:
awk 'BEGIN{print "<Envelope xmlns=\"http://schemas.xmlsoap.org/soap/envelope/\">\n\t<Body>\n\t\t<yourRequest xmlns=\"yourNamespace\" xmlns:a=\"http://schemas.microsoft.com/Arrays\">"}{print "\t\t\t<a:int>"$0"</a:int>"}END{print "\t\t</yourRequest>\n\t</Body>\n</Envelope>"}' identifiers.txt > request.xml
After executing the previous command, the request.xml will have the following content:
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
	<Body>
		<yourRequest xmlns="yourNamespace" xmlns:a="http://schemas.microsoft.com/Arrays">
			<a:int>0000000</a:int>
			<a:int>0000001</a:int>
			<a:int>0000002</a:int>
			<a:int>0000003</a:int>
		</yourRequest>
	</Body>
</Envelope>
For the SOAP envelope, I have used the namespace for SOAP 1.1 (http://schemas.xmlsoap.org/soap/envelope/). If your service understands SOAP 1.2 only, change the namespace accordingly (http://www.w3.org/2003/05/soap-envelope).
To call the web service, you can use curl:
curl -d "#request.xml" "http://url/to/your/web/service" -H "Content-Type: text/xml"
Add a SOAPAction HTTP header if your web service requires it. Also change the content type from text/xml for SOAP 1.1 to application/soap+xml for SOAP 1.2.
With all that being said, the final request with a million numbers will be at least about 18 MB. Such a huge request will most likely fail because of a maximum POST size limit or a timeout. To work around this, split the request into smaller requests.
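If you prefer to script the whole process, here is a minimal Python sketch of the chunking idea (untested; CHUNK_SIZE, the service URL, and the yourRequest/yourNamespace names are placeholders you have to adapt to your service):

import urllib.request

# Placeholders: adapt the chunk size, service URL, and request element
# names to match your web service.
CHUNK_SIZE = 10000
URL = "http://url/to/your/web/service"
ENVELOPE = (
    '<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">'
    '<Body>'
    '<yourRequest xmlns="yourNamespace" '
    'xmlns:a="http://schemas.microsoft.com/Arrays">%s</yourRequest>'
    '</Body>'
    '</Envelope>'
)

with open("identifiers.txt") as f:
    ids = [line.strip() for line in f if line.strip()]

# Send one SOAP request per chunk instead of one huge request.
for i in range(0, len(ids), CHUNK_SIZE):
    body = "".join("<a:int>%s</a:int>" % n for n in ids[i:i + CHUNK_SIZE])
    req = urllib.request.Request(
        URL,
        data=(ENVELOPE % body).encode("utf-8"),
        headers={"Content-Type": "text/xml"},  # SOAP 1.1
    )
    with urllib.request.urlopen(req) as resp:
        print(i, resp.status)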
I am trying to send data to Devo with HTTP requests. The request works and a 204 status is received. I am following the Devo docs where this method is explained.
I can search the data in Devo, but it is contained in the unknown.unknown table. What am I doing wrong?
http://http-us.logtrust.io/event/MY_DOMAIN/token!MY_TOKEN/local1/XXXX.test.example?hello_world_message
Previously I created the token with target table: XXXX.test.* (where XXXX is the name of the project).
After spending some time I found the answer in one of the Devo docs:
https://docs.devo.com/confluence/ndt/supported-technologies/special-devo-tags-and-data-tables
The problem was that I was using "XXXX.test" as the tag, while the first and second positions of a tag define the technology used to parse the data. In this case it is a custom application, so it is necessary to use "my.app".
Before executing the request it was necessary to create a new token for the new tag to use: "my.app.test.*".
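For illustration only, here is a minimal Python sketch of the corrected request, reusing the URL pattern from the question (MY_DOMAIN and MY_TOKEN remain placeholders):

import urllib.request

# Same endpoint as in the question, but tagged my.app.test.example so
# Devo parses it as a custom application (MY_DOMAIN and MY_TOKEN are
# placeholders for your own values).
url = ("http://http-us.logtrust.io/event/MY_DOMAIN/token!MY_TOKEN"
       "/local1/my.app.test.example?hello_world_message")
with urllib.request.urlopen(url) as resp:
    print(resp.status)  # a 204 indicates the event was accepted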
Now I can visualize the data in Devo.
I am writing an Apache module output filter that needs to consume a couple of internal-only response headers. These response headers are set by a Perl-based application running in the backend. The APR function I am using in my output filter is:
apr_table_get(r->headers_out, "x-my-response-header");
However, what seems to happen is that in my output filter I do not see the above response header set until the third or fourth bucket brigade, which is unfortunately already too late: I need the value of x-my-response-header to compute a new response header and set that in the response to the browser.
I insert my output filter this way:
ap_hook_insert_filter(insertOutputFilterHook, NULL, NULL, APR_HOOK_FIRST);
ap_register_output_filter(myFiltersName, myOutputFilter, NULL, AP_FTYPE_CONTENT_SET);
What I have verified:
The internal-only headers do appear in the HTTP response in my browser (I haven't unset them yet)
The first two bucket brigades' buckets contain HTML page text
Questions:
What could be the reason that the internal-only response header is not set/visible in the first call to my output filter / first bucket brigade?
Is it possible to instead accumulate the first few bucket brigades and then start flushing them out once the internal-only response header's value is known?
When I'm using google.appengine.api.urlfetch.fetch (or the asynchronous variant with make_rpc) to fetch a URL that steadily streams data, after a while I will get a google.appengine.api.urlfetch_errors.DeadlineExceededError as expected. Since it is a stream that I want to sample, setting the deadline to a higher value can't ever help, unless the stream finishes (which I do not expect to happen).
It seems there is no possibility of getting the partially downloaded result. At least the API doesn't offer anything. Is it possible to
either request the downloaded part
or only ask for a certain amount of data (since I can estimate the stream's rate) to be downloaded?
[Clarification: Since it is a stream, requests with a Range header will be answered with 200 OK and not 206 Partial Content.]
In your call to urlfetch.fetch, you can set HTTP headers. The Range header is how you specify a partial-download request in HTTP:
resp = urlfetch.fetch(
    url=whatever,
    headers={'Range': 'bytes=100-199'})
if those are the 100 bytes you want. The HTTP status code you get should be 206 for such a partial download, etc. (none of that is GAE-specific). See e.g. http://en.wikipedia.org/wiki/Byte_serving for details.
I'm trying to simply delete a few cards that were created by my app. However, it appears as though the list() method cycles through every single card in the entire user's timeline.
My code below is slightly modified from the example in the documentation under timeline list. When I attempted to use it, it looped through every card in my timeline, using up my entire 1,000/day quota in just a few seconds before the operation timed out.
def delete_previous_cards(self):
    """
    This cleans up any cards that may have been leftover.
    """
    result = []
    request = self.mirror_service.timeline().list()
    while request:
        try:
            timeline_items = request.execute()
            result.extend(timeline_items.get('items', []))
            request = self.mirror_service.timeline().list_next(
                request, timeline_items)
        except errors.HttpError as error:
            print 'An error occurred: %s' % error
            break
    for item in result:
        item_id = item['id']
        self.mirror_service.timeline().delete(id=item_id).execute()
What's the best way to efficiently delete the cards created by my app?
There's a JavaScript-based tool that an Explorer wrote for just this purpose: Glass Cleaner.
It looks to me like the Python example is missing any concept of a pageToken: most of the other language examples have a nextPageToken and loop until the response does not have one. If you keep requesting the first page over and over, then even if you only have three cards you will quickly exhaust your API quota.
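As an illustration only (untested, and assuming the standard Python client from the question), an explicit loop over nextPageToken might look like this:

# Hypothetical sketch: page through the timeline explicitly via
# nextPageToken and delete each returned card.
def delete_app_cards(mirror_service):
    page_token = None
    while True:
        kwargs = {'pageToken': page_token} if page_token else {}
        response = mirror_service.timeline().list(**kwargs).execute()
        for item in response.get('items', []):
            mirror_service.timeline().delete(id=item['id']).execute()
        page_token = response.get('nextPageToken')
        if not page_token:
            break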
The rest of this answer is general information about list and delete, plus some curl commands you can safely experiment with that won't loop and exhaust your quota quite as quickly. Make special note of the nextPageToken property in the JSON returned by the list commands.
LIST and DELETE are weird and don't follow the documentation exactly, in my experience.
Here is a sample curl command for list:
curl -x http://localhost:5671 -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  https://www.googleapis.com/mirror/v1/timeline
It returns 10 items for the user and app that are associated with the token.
It includes deleted items (isDeleted set to true) but does not show the isDeleted property in the output JSON, which is weird.
If you modify it slightly:
curl -x http://localhost:5671 -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  https://www.googleapis.com/mirror/v1/timeline?isDeleted=true
(note the trailing parameter), now you get the same list, but the output JSON includes the isDeleted property. The lesson for me here is that you should probably be requesting isDeleted=false for looping delete requests.
To delete an item you can do this:
curl -x http://localhost:5671 -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  -H "Content-Type: application/json" -v -X DELETE \
  https://www.googleapis.com/mirror/v1/timeline/ID_OF_A_TIMELINE_CARD
Note that you have to use an actual id from a card at the end; grab one from a list command above.
When you do a successful DELETE, the response is a 204, which in a RESTful world indicates delete success.
Then, if you do a subsequent list as in the first example above, the item will come right back and will not be marked as deleted, because the isDeleted property is missing.
Pages seem to be 10 items in size, but I guess that could change, since I didn't find it documented anywhere.
nextPageToken values are very long strings that frequently have identical beginnings and ends, so it can be confusing to look at them: you might inadvertently think two tokens are identical when they are not. The lesson here is to compare very carefully in the middle.
Hopefully those curl commands help you experiment when your API quota comes back. I would test for a null or empty string nextPageToken to tell you when to exit your loop. The equivalent Java code is:
} while (request.getPageToken() != null && request.getPageToken().length() > 0);
Good luck, and great question.
I'm using Twilio to send SMS messages from App Engine. Twilio doesn't accept SMS messages longer than 160 characters, so I have to split them. I am splitting the messages and sending them as follows:
def send_sms_via_twilio(mobile_number, message_text):
    client = TwilioRestClient(twilio_account_sid, twilio_auth_token)
    message = client.sms.messages.create(to=mobile_number,
                                         from_=my_twilio_number,
                                         body=message_text)

split_list = split_sms(long_message)
for each_message in split_list:
    send_sms_via_twilio(mobile_number, each_message)
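(split_sms is my own helper that splits the text and prepends "1/5", "2/5", ... counters; a purely illustrative sketch of such a splitter, assuming plain GSM-7 text, might be:)

# Purely illustrative sketch of a splitter: a naive fixed-width split
# that reserves room for an "i/n " prefix so each part fits in 160
# GSM-7 characters.
def split_sms(text, limit=160, prefix_len=8):
    width = limit - prefix_len
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    total = len(chunks)
    return ['%d/%d %s' % (n + 1, total, chunk)
            for n, chunk in enumerate(chunks)]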
However, I found that the order of sending varied. For example, sometimes I'd receive message 2/5, then 1/5, then 4/5, and so on, while other times the order would be correct. The order of split_list is definitely correct. To overcome the incorrect ordering of the messages I tried
for each_message in split_list:
    deferred.defer(send_sms_via_twilio, mobile_number, each_message,
                   _countdown=1)
However, I encountered the same problem. I then tried
for each_message in split_list:
    deferred.defer(send_sms_via_twilio, mobile_number, each_message,
                   _countdown=1, _queue="send-text-message")
and defined my queue as
- name: send-text-message
  rate: 1/s
  bucket_size: 10
  max_concurrent_requests: 1
  retry_parameters:
    task_retry_limit: 5
My thinking was that the issue was concurrency (running on python27) and that limiting max_concurrent_requests would solve it. However, the issue is still present, i.e. the texts still get sent in the wrong order. I checked the logs but couldn't see any notification of task failure; the tasks just seem to be executing in the wrong order.
Is there something I am missing? How can I fix this issue?
Note that SMS messaging (specifically the underlying protocols like SMPP) is asynchronous by definition, which means there is no way you can specify the order of distinct SMS messages.
There is a way to specify the order of SMS parts by using the UDH (User Data Header) in the binary body of those messages, but this works only for long SMS messages, i.e. those that are too long to be sent in one message. For example, if your message exceeds 160 GSM-7 characters or 70 UCS-2 (UTF-16) characters, it will be sent as more than one part with UDH.
In that case the mobile phone won't show the message parts as they arrive; it will collect them in memory until the last one comes and then assemble them in the right order. For the end user this is just a message longer than usual, and you don't have to write "1/3", "2/3", ... in the message.
Disclaimer: I work for a company that enables you to send and receive both multiple binary messages with user-specified headers (UDH) and/or standard long messages.
If you are not tied to Twilio, try using SMSified. They automatically split the message for you, ensure the parts are in the correct order, and add "1/2", "2/2", ... to the end of each message. In other words, you just send the complete message to their REST API, no matter the length, and they handle the rest. Since they also use a REST API, you can continue to use Python.