How do I verify if my USPS IMpb barcode is correct?

I'm writing software that uses the USPS Intelligent Mail Package Barcode (IMpb).
I have to send my labels to DHL for verification, but to avoid embarrassment I'd like to try to verify them myself first, at least to make sure that things like the checksum are correct and that the barcodes are the correct length.
Are there any tools, or verification 'snippets' to do that?

To verify check digit
Just google the number - if it is a valid USPS package it will show up in the results.
If I change the last digit (check digit) from 6 to 7 then there are no results.
You can also see that the search result has stripped off the leading 42054935, which is the application identifier (420) plus the ZIP code (54935).
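The check digit can also be verified offline. Here is a minimal Python sketch, assuming the IMpb check digit follows the standard mod-10 scheme with weights 3 and 1 applied from the right (the same scheme GS1 uses) and that the routing prefix (420 plus ZIP code) is left out of the calculation. The function names and the `routing_len` parameter are made up for illustration and are not part of any USPS tool.

```python
def mod10_check_digit(body: str) -> int:
    """Return the mod-10 check digit for a digit string (check digit excluded)."""
    if not (body.isascii() and body.isdigit()):
        raise ValueError("body must contain ASCII digits only")
    total = 0
    # Walk from the rightmost digit: odd positions from the right weigh 3, even weigh 1.
    for position, char in enumerate(reversed(body), start=1):
        weight = 3 if position % 2 == 1 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def has_valid_check_digit(barcode: str, routing_len: int = 0) -> bool:
    """Check the last digit after dropping routing_len leading characters.

    routing_len is an assumption of this sketch: pass 8 for "420" + 5-digit ZIP.
    """
    digits = barcode.replace(" ", "")[routing_len:]
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        return False
    return mod10_check_digit(digits[:-1]) == int(digits[-1])
```

A length check can sit next to this: compare `len(digits)` with whatever length your label specification requires before testing the check digit.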
To verify barcode itself
You can upload an image to this online barcode reader and it will parse the barcode for you.
I'm not sure whether it will give false positives if you don't have all the start and stop characters correct, but it will show you whether you've got the FNC1 character correct. You can even upload a PDF.

Related

How to encode FNC1 starting character to make GS1 Datamatrix?

I have built the following string for a GS1 DataMatrix:
è010506060985000521sn1234567890ab 1002TRIAL003 17200228
ASCII 232 (FNC1)
(01) Product Code (aka GTIN)
(21) Serial Number
ASCII 29 (aka Group Separator)
(10) Lot/Batch
ASCII 29 (aka Group Separator)
(17) Expiry Date
I am passing this string to a DevExpress control, with the symbology set to DataMatrix and the compatible mode set to ASCII.
The resulting barcode scans correctly as GS1 DataMatrix. But when I sent this string to our printing partner in China, he printed it, and scanning his barcode gives me the error “Unknown encoding”.
I think their system is not able to encode ASCII 232 (“è”).
Is there an alternative?
I tried replacing the FNC1 start character, ASCII 232, with ASCII 29. Is that the correct way? Is the result a GS1 DataMatrix?
(When I scan it in one mobile app it is reported as GS1 DataMatrix, but in another app it is reported as plain DataMatrix.)
I want to achieve a GS1 DataMatrix.
Thanks
This issue depends entirely on the hardware used. The way to indicate the FNC1 character may differ between printer families and types. Do you have information on which one is used in your case?
First, your printing partner should check the label he is creating himself (there is an easy-to-use GS1 smartphone app for that), so he can see directly whether the expected information is present and correctly encoded.
Then, you should check which printer type he is using and which software is used to create the printer mask/job. I know lots of people use NiceLabel, for example, but I remember that some issues with the FNC1 character can come up if you are using some recent Zebra printers. This is something the printer vendor's after-sales support can probably help with if it's something similar.
[EDIT]: In case of doubt this can help, but you probably have it already.
Based on what you said, your side is acting like a scanner, so check chapter 2.2.1: "Important: In accordance with ISO/IEC 15424 - Data Carrier Identifiers (including Symbology Identifiers), the Symbology Identifier is the first three characters transmitted by the scanner indicating symbology type. For a GS1 DataMatrix the symbology identifier is ]d2."
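To make the ]d2 check concrete, here is a rough Python sketch of how raw scanner output could be tested and split into its elements. It assumes the scanner transmits the symbology identifier and sends the separator FNC1 as ASCII 29, and it only knows the four application identifiers from the question; `parse_gs1_scan` and `AI_LENGTHS` are made-up names, not part of any library.

```python
GS = "\x1d"            # ASCII 29, group separator
SYMBOLOGY_ID = "]d2"   # prefix a scanner sends for GS1 DataMatrix

# Assumption: only the four AIs from the question are handled.
# None means variable length (terminated by GS or by the end of the data).
AI_LENGTHS = {"01": 14, "17": 6, "21": None, "10": None}


def parse_gs1_scan(scan: str) -> dict:
    """Split raw scanner output into {AI: value}; raise ValueError if not GS1."""
    if not scan.startswith(SYMBOLOGY_ID):
        raise ValueError("missing ]d2 prefix: not reported as GS1 DataMatrix")
    data = scan[len(SYMBOLOGY_ID):]
    fields = {}
    pos = 0
    while pos < len(data):
        if data[pos] == GS:              # skip separators between fields
            pos += 1
            continue
        ai = data[pos:pos + 2]
        if ai not in AI_LENGTHS:
            raise ValueError(f"unsupported AI {ai!r}")
        pos += 2
        length = AI_LENGTHS[ai]
        if length is None:               # variable length: read up to the next GS
            end = data.find(GS, pos)
            end = len(data) if end == -1 else end
        else:                            # fixed length: read exactly that many characters
            end = pos + length
        fields[ai] = data[pos:end]
        pos = end
    return fields
```

If the printed label comes back without the ]d2 prefix, the symbol was most likely encoded as a plain DataMatrix, which would match the "one app says GS1, the other does not" symptom.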

How to tell Watson conversation to not recognize strings as numbers

I'm facing a strange issue with IBM Watson Conversation when capturing numbers in Spanish:
In Spanish, when you write (or say) "please give me an answer" (por favor, dame una respuesta) or "I want to talk with a professional" (quiero hablar con un profesional), Watson recognizes the words "una" and "un" as numbers. Yes, each can be a number (the number 1), but in these phrases they do not have the meaning of a number; they work as articles.
Do you know how to tell Watson not to recognize such strings as numbers? I have been thinking about patterns, but the numbers can have different lengths.
According to the official documentation, the @sys-number system entity detects numbers that are written using either numerals or words. In either case, a numeric value is returned.
When you enable the @sys-number system entity, it always tries to detect whether the user typed a number. These are the recognized formats:
21
twenty one (in your case, works with un, una, etc)
3.13
You can see this table showing how to use this entity with other examples:
So Watson will recognize these values (un, una) as the number one, and there is currently no exception list or configuration to stop it from recognizing a specific word, such as the ones in your example.
If for some purpose you want to send 'una' or 'un' back to the user (the literal text they typed), just add this to your conversation response:
The number is @sys-number.literal
And the bot will return:
The number is un?
See more about the @sys-number system entity.
See more about System entities.
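Since there is no setting for this, one possible workaround is to filter the entities in your own application after Watson responds. The Python sketch below is only an illustration: it assumes each entity arrives as a dict with an "entity" name and a "location" pair of character offsets into the user input, and the helper name and article list are made up.

```python
# Assumption: each entity is a dict with "entity" and "location" ([start, end]
# character offsets into the user input), as in a Conversation message response.
SPANISH_ARTICLES = {"un", "una"}


def drop_article_numbers(user_text: str, entities: list) -> list:
    """Return the entities without sys-number matches whose literal is an article."""
    kept = []
    for entity in entities:
        start, end = entity["location"]
        literal = user_text[start:end].lower()
        if entity["entity"] == "sys-number" and literal in SPANISH_ARTICLES:
            continue  # treat "un"/"una" as an article, not as the number 1
        kept.append(entity)
    return kept
```

Note that this also discards the cases where "un" really does mean one, so it only suits dialogs where a spelled-out "one" is not expected.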

Reading specified lines below a found string

I'm going to create a simple bank account database in C, but I haven't quite figured out how to fetch the data for a specific account that has already been created and written to a file. I was thinking of searching from the beginning of the file, using fseek, for a specified account number, since all account numbers will be unique. Is there a way to read a specified number of lines below that account number once it is found? For example, my file accounts.txt will contain accounts like this:
Account # : 13398
First Name : Eric
Last Name : Walters
Parish : St.tofu
Year of Birth : 1980
Age : 34
Savings Period : 5 year(s)
Password : Eric1
Account # : 13398
Account balance: $0.00
====================================
I want to search through the file for the account number, fetch it along with the 10 lines below it, and display them on the screen. If this is possible, then say 'aye' and point me to the area I should study to achieve it. When I'm successful, I'll post my code here to show what I have done.
fseek() allows you to skip a certain number of bytes in a file. If your lines are not always the same length, you will have to read the entire file, not just to search for the account numbers, but also to find the ten newlines that delimit each account. To do this, you are better off using fgets().
The steps would be something like this:
foreach line in file
    if line starts with "Account #"
        if the number is the one you want
            print this line and the next 10 lines
        else
            skip the next 10 lines
    else
        keep looking
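The steps above can be sketched as follows. This is written in Python only to keep it short; in C the same loop would read each line with fgets() and compare the prefix with strncmp(). The function name and the RECORD_TAIL constant are made up for the sketch, and the record layout is assumed to match the sample file in the question.

```python
RECORD_TAIL = 10  # lines that follow the first "Account #" line of each record


def find_account(lines, account_number: str):
    """Return the matching "Account #" line plus the next RECORD_TAIL lines, or None."""
    lines = iter(lines)
    for line in lines:
        if not line.startswith("Account #"):
            continue                                   # keep looking
        number = line.partition(":")[2].strip()
        # Consume the rest of the record either way, so the repeated
        # "Account #" line inside it is never mistaken for a new record.
        tail = [next(lines, "") for _ in range(RECORD_TAIL)]
        if number == account_number:
            return [line] + tail
    return None
```

With a real file this would be called as `find_account(open("accounts.txt"), "13398")`, since a file object yields its lines one at a time.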
Firstly, fseek is used to move the file position, not to search. For searching text, i.e. the account ID in your case, there are some examples in "Trying to find and replace a string from file in C". To write your own code, learning the basic use of the file handling functions is enough. Furthermore, since your data is structured (every 11 lines represent one account), your code can be accelerated. Finally, what you are trying to do is what database software offers, and it is hard to implement your own database that is as fast as commercial software.
You could search in the file, but that would be a bit tedious. It would be even more tedious if you wanted to modify the account details.
Why don't you use SQLite? It is designed to replace fopen().

PDFBox adding white spaces within words

When I try to extract text from my PDF files, PDFBox seems to insert white space inside several words at random.
I am using pdfbox-app-1.6.0.jar (latest version) on the following sample file from the Downloads section of this page:
http://www.sheffield.gov.uk/roads/children/parents/6-11/pedestrian-training
I've tried several other PDF files, and it seems to do the same on several pages.
I run the following on the downloaded file:
java -jar pdfbox-app-1.6.0.jar ExtractText -force -console ~/Desktop/"ped training pdf.pdf"
and the result on the console contains wrongly inserted spaces, for example:
"• If ch ildren are able to walk to
schoo l safely this could reduce the
congestion. "
"• Develops good hab its for later life."
"www.sheff ield.gov.uk"
"Think Ahead!, wh ich is based on the"
etc etc.
As you can see, several of the words above have spaces inside them for no reason I can fathom.
I am on Ubuntu and running Sun's JDK 1.6.
I've tried this on several different PDF files and searched the forums for a solution; there were similar bugs, but all seemed to have been resolved.
If anyone can help, or has the same problem, please comment. This is causing a big problem in indexing the content properly for searching.
Unfortunately there is currently no easy solution for this.
Internally PDF documents simply contain instructions like "place characters 'abc' in position X" and "place characters 'def' in position Y", and PDFBox tries to reason whether the resulting extracted text should be "abc def" or "abcdef" based on things like the distance between X and Y. These heuristics are generally pretty accurate, but as you can see they don't always produce the correct result.
One way to improve the quality of the extracted text is to try a dictionary lookup on each extracted word or token. If the lookup fails, try combining the token with the next one. If a dictionary lookup on the combined token succeeds, then it's fairly likely that the text extractor has mistakenly added an extra space inside the word. Unfortunately such a feature does not yet exist in PDFBox. See https://issues.apache.org/jira/browse/PDFBOX-1153 for the feature request filed for this. Patches welcome!
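As a rough illustration of the dictionary idea (this is not PDFBox code, and PDFBox has no such feature), here is a minimal Python sketch. The tiny hard-coded word set stands in for a real dictionary, and `merge_split_words` is a made-up name.

```python
def merge_split_words(tokens, dictionary):
    """Join a token with its successor when only the combined form is a known word."""
    merged = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        has_next = i + 1 < len(tokens)
        if (token.lower() not in dictionary and has_next
                and (token + tokens[i + 1]).lower() in dictionary):
            merged.append(token + tokens[i + 1])   # likely a spurious space
            i += 2
        else:
            merged.append(token)
            i += 1
    return merged


# Stand-in dictionary; a real one would be loaded from a word list.
WORDS = {"if", "children", "are", "able", "to", "walk", "school", "safely"}
extracted = "If ch ildren are able to walk to schoo l safely".split()
print(" ".join(merge_split_words(extracted, WORDS)))
# prints: If children are able to walk to school safely
```

This only repairs a single spurious space per word, and it does nothing when the first fragment happens to be a valid word by itself, so treat it as a starting point.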
The class org.apache.pdfbox.util.PDFTextStripper (pdfbox-1.7.1) lets you adjust how readily it decides that two strings are part of the same word.
Increasing spacingTolerance will reduce the number of inserted spaces.
/**
 * Set the space width-based tolerance value that is used
 * to estimate where spaces in text should be added. Note that the
 * default value for this has been determined from trial and error.
 * Setting this value larger will reduce the number of spaces added.
 *
 * @param spacingToleranceValue tolerance / scaling factor to use
 */
public void setSpacingTolerance(float spacingToleranceValue) {
    this.spacingTolerance = spacingToleranceValue;
}

Twitter name length in DB

I'm adding a field to a member table to hold the Twitter names of members on a site. From what I can work out, the maximum Twitter name length is 20, so it seems obvious that I should set the field size to varchar(20) (SQL Server).
Is this a good idea?
What if Twitter starts allowing multi-byte characters in the user names? Should I make this field nvarchar?
What if Twitter decides to increase the size of a username? Should I make it 50 instead and then warn a user if they enter a name longer than 20?
I'm trying to code defensively so that I can reduce the chances of modifying the code around this input field and the DB schema changes that might be needed.
While looking for the same information, I found the following in a somewhat odd place in the Twitter help section (why not in the API docs? who knows?):
"Your user name can contain up to 15 characters. Why no more? Because we append your user name to your 140 characters on outgoing SMS updates and IM messages. If your name is longer than 15 characters, your message would be too long to send in a single text message."
http://help.twitter.com/entries/14609-how-to-change-your-username
So perhaps one could even get away with varchar(16).
While new accounts have a limit of 15 characters for the username and 20 characters for the name, for old accounts this limit seems to be undefined. The documentation states:
Earlybirds: Early users of Twitter may have a username or real name longer than user names we currently allow. This is ok until you need to save changes to your account settings. No changes will save unless your user/real name is the appropriate length; this means you have to change your real name/username to meet our most modern regulations.
So you are probably better off having a long field and saving yourself some time when you hit the edge cases.
Nowadays, space is usually not a concern, so I'd use a mostly generic approach: use nvarchar(200).
When designing DB schemas you must think 2 steps ahead, even more than when programming. Or get yourself a good schema update strategy, then you'll be fine also with varchar(20).
Personally I wouldn't worry. Use something like 200 (or a nice round number like 256) and you won't have this problem. The limit then is on their API, so you might be best to do some verification that it is a real username anyway. That verification implicitly includes the length checking.
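A local format check before calling the API might look like the Python sketch below. It is only a sketch: the 15-character default comes from the help text quoted above, the allowed character set (ASCII letters, digits and underscores) is an assumption, and `is_plausible_username` is a made-up name. It cannot tell you whether the account actually exists.

```python
import re

# Assumption: usernames consist of ASCII letters, digits and underscores.
USERNAME_CHARS = re.compile(r"[A-Za-z0-9_]+")


def is_plausible_username(name: str, max_len: int = 15) -> bool:
    """Cheap format check; max_len is a parameter because old accounts may exceed 15."""
    if name.startswith("@"):
        name = name[1:]                      # tolerate a leading @
    return 0 < len(name) <= max_len and USERNAME_CHARS.fullmatch(name) is not None
```

Keeping the limit in application code like this means the column can stay generously sized while the rule changes in one place.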
Twitter allows for 140 characters to be typed in as the message payload for transmission, and includes "[username]:" at the beginning of the SMS message. With an upper limit of 140 characters for the message combined with the messaging system being based on SMS, I think they would have to decrease the allowable message size to increase the username. I think it is a pretty safe bet that 20 characters would be the max username length. I'd use nvarchar just in case someone uses 16-bit characters, and maybe pad it a little. nvarchar(24) should work; I wouldn't go any higher than nvarchar(32).
If you're going to develop an app for their service, you should probably watch the messages on Twitter's API Announcements mailing list.
[opinion only]
Twitter works over SMS, and the limit there is 160 characters, so the name has to be short to avoid eating into the message.
nvarchar would be a good idea for all twitter text
If the real ID of a Twitterer is a cell-phone then the longest phone number is your max - 20 should easily cover it!
Defensive programming is always good :) !
[/opinion only]
There's only so much you can code defensively. I'd suggest looking at the Twitter API documentation and following anything specified there. That said, from a cursory look through it, nowhere seems to specify the length of the username, annoyingly :/
One thing to keep in mind here is that a field using nvarchar needs twice as much storage, since it uses 2 bytes for each Unicode character. (The declared length of an nvarchar column is counted in characters, not bytes, so the byte figures below overstate the length you need to declare.) So a Twitter status would need 280 bytes using nvarchar, PLUS some more for possible retweets, as those aren't included in the 140-character limit. I discovered this just today, in fact!
For example:
RT @chatrbyte: here's some great tweet
that I'm retweeting.
The RT @chatrbyte: is not included in the 140-character limit.
So, assuming that a Twitter username has a 20-character limit, and wanting to also capture a retweet, a field to hold a full tweet would need 280 + 40 (for the username) + 8 (for the initial RT @ before a retweet) + 4 (for the : and space after the retweeted username) = 332 bytes.
I would say go for nvarchar(350) to give yourself a little room. That's what I am trying right now. If I'm wrong I'll update here.
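The arithmetic can be checked with a few lines of Python. This sketch assumes a 20-character username, a full 140-character message, and characters that each take 2 bytes in UTF-16 (the encoding nvarchar uses); `nvarchar_bytes` is a made-up helper name.

```python
def nvarchar_bytes(text: str) -> int:
    """Bytes needed to store text as UTF-16 without a byte-order mark."""
    return len(text.encode("utf-16-le"))


username = "u" * 20                                # assumed 20-character username
retweet = "RT @" + username + ": " + "x" * 140     # prefix plus a full 140-character status
print(len(retweet), nvarchar_bytes(retweet))
# prints: 166 332
```

So the longest retweet under these assumptions is 166 characters, or 332 bytes of storage. Because nvarchar(n) counts characters, nvarchar(350) leaves plenty of room.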
I'm guessing you are managing the data entry for the Twitter name field somewhere in your application other than just in the database. If you open the field up to 200 characters, you only have to change the code in one place; and if you allow users to enter Twitter names with more than 20 characters, you don't have to worry about a change at all.
