How to get array from non-array field for INSERT command?

How to get array from non-array field for INSERT command? - arrays

I'm trying to use a SELECT statement together with a INSERT INTO command. Everything would work fine, if there wasn't a small problem: some fields of the table are defined as real[] but my input is numeric. Thus, the question:
Is there a function in PostgreSQL to create out of the single numeric input an array of type real (with just one element)?
My setting looks like this:
tempLogTable(..., logValue NUMERIC, ...)
finalLogTable(..., logValues REAL[], ...)
The idea is to insert the tuples from the tempLogTable to the finalLogTable using INSERT INTO ... SELECT .... Unfortunately, because of various reasons the data types are given and I would not like to change these for the moment (not to break anything).
I'm using PostgreSQL 9.2.

SELECT ARRAY[thenumeric::real] FROM the_table;
or
SELECT ARRAY[thenumeric]::real[] FROM the_table;
They're not really any different for a one-element array.
real has limits that numeric doesn't. In particular, comparing real values for equality doesn't work reliably; you should instead compare for two numerics being different by smaller than a small (somewhat task-specific) amount. It also can't represent values as big or small as numeric can. See the floating point guide among other info on comparing floats. This will be much harder to do right when they're wrapped in arrays.
For the purpose you describe, where it sounds like you are just collecting stats or historical data, that isn't going to be a problem. It usually only turns out to be an issue where people try to write:
WHERE some_real = some_other_real
which will result in surprising and unexpected behaviour.
You should be fine with an INSERT INTO ... SELECT as described.

Related

TEXTSPLIT combined with BYROW returns an unexpected result when using an array of strings as input

I am testing the following simple case:
=LET(input, {"a,b;c,d;" ; "e,d;f,g;"},
BYROW(input, LAMBDA(item, TEXTJOIN(";",,TEXTSPLIT(item,",",";", TRUE)))))
since the TEXTJOIN is the inverse operation of TEXTSPLIT, the output should be the same as input without the last ;, but it doesn't work like that.
If I try using a range instead it works:
It works for a single string:
=LET(input, "a,b;c,d;", TEXTJOIN(";",,TEXTSPLIT(input,",",";", TRUE)))
it returns: a,b;c,d
What I am doing wrong here? I think it might be a bug. Per TEXTSPLIT documentation there is no constraint of using TEXTSPLIT combined with BYROW when using an array of strings.

Not sure if this would classify as an answer but thought I'd share my attempt at it.
I don't think the problem here is TEXTSPLIT(). I tried different things. 1st I tried to incorporate FILTERXML() to do the split, with the exact same result. For good measure:
=BYROW({"a,b;c,d;","e,d;f,g;"},LAMBDA(item,TEXTJOIN(";",,FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(item,",",";"),";","</s><s>")&"</s></t>","//s"))))
Then I tried to enforce array usage with T(IF(1,TEXTSPLIT("a,b;c,d;",{",",";"},,1))) but Excel would not budge.
The above lead me to believe the problem is in fact BYROW() itself. Even though documentation says the 1st parameter takes an array, the working with other array-functions do seem to be buggy and you could report it as such.
For what it's worth for now; you could use REDUCE() as mentioned in the comments and in the linked answer however I'd preserve that for more intricate stacking of uneven distributed columns/rows. In your case MAP() will work and is simpler than BYROW():
=LET(input, {"a,b;c,d;";"e,d;f,g;"},
MAP(input, LAMBDA(item, TEXTJOIN(";",,TEXTSPLIT(item,",",";", TRUE)))))
And to be honest, this is kind of what MAP() is designed for anyway.

SPARQL: array-strings conversion / extraction

I am doing a SPARQL Query that for one variable gives me the output "[-1.6101874126499998e-19]". This is obviously a string containing an array (that could also contain more numbers). Is there a way to access the number in it or does it need to be done in the underlying graph?
If it needs to be changed in the graph, what is a good ontological way to create multi-dimensional arrays...?

xsd:float(SUBSTR(?var,2,STRLEN(?var)-2)) does the job in this case, but it's quite hacky :/

Generate unique ID from string

I am trying to take a text string and create a unique numerical value from it and I am not having any luck.
For example, I have user names (first and last) and birthdate. I have tried taking these values and converting them to varbinary, which does give me a numerical value from the data, but it isn't unique. Out of ~700 records, I will get at least 100 numerical values that are duplicated but the text of first name, last name, and birthdate that was used to generate the number is different.
Here is some code I have been trying:
SELECT CONVERT(VARCHAR(300), CONVERT(BIGINT,(CONVERT(VARBINARY, SE.FirstName) + CONVERT(VARBINARY, SE.BirthDate) ))) FROM ELIGIBILITY SE
If I use that code and convert the following data, the result is 3530884780910457344. So the same number is generated from this unique data:
David 12/03/1952
Janice 12/23/1952
Michael 03/24/1952
Mark 12/23/1952
I am looking for some way, the simpler the better, to take these values and generate a unique numerical value from that data. And the reason why I need to use these values as input is because I am trying to avoid creating duplicates in the future as well as be able to predict the numerical value based on the formula. This is why NewID() won't work for me.

How about simply:
SELECT CHECKSUM(name, BirthDate) FROM dbo.ELIGIBILITY;
Of course, since there are still chances for collisions, maybe you should better define what you are actually trying to do. You've stated some reasons why e.g. NEWID() won't work but I still don't follow the the underlying purpose of this unique number.

How to convert a PGresult to custom data type with libpq (PostgreSQL)

I'm using the libpq library in C to accessing my PostgreSQL database. So, when I do res = PQexec(conn, "SELECT point FROM test_point3d"); I don't know how to convert the PGresult I got to my custom data type.
I know I can use the PQgetValue function, but again I don't know how to convert the returning string to my custom data type.

The best way to think about this is that data types interact with applications over a textual interfaces. Libpq returns a string from just about anything. The programmer has a responsibility to parse the string and create a data type from it. I know the author has probably abandoned the question but I am working on something similar and it is worth documenting a few important tricks here that are helpful in some cases.
Obviously if this is a C language type, with its own in and out representation, then you will have to parse the string the way you would normally.
However for arrays and tuples, the notation is basically
[open_type_identifier][csv_string][close_type_identifier]
For example a tuple may be represented as:
(35,65,1111111,f,f,2011-10-06,"2011-10-07 13:11:24.324195",186,chris,f,,,,f)
This makes it easy to parse. You can generally use existing csv processers once you trip off the first and last character. Moreover, consider:
select row('test', 'testing, inc', array['test', 'testing, inc']);
row
-------------------------------------------------
(test,"testing, inc","{test,""testing, inc""}")
(1 row)
As this shows you have standard CSV escaping inside nested attributes, so you can, in fact, determine that the third attribute is an array, and then (having undoubled the quotes), parse it as an array. In this way nested data structures can be processed in a manner roughly similar to what you might expect with a format like JSON. The trick though is that it is nested CSV.

Dropping Leading Zeros

I have a form that records a student ID number. Some of those numbers contain a leading zero. When the number gets recorded into the database it drops the leading 0.
The field is set up to only accept numbers. The length of the student ID varies.
I need the field to be recorded and displayed with the leading zero.

If you are always going to have a number of a certain length (say, it will always be 10 characters), then you can just get the length of the number in the database (after it is converted to a string) and then add the appropriate 0's.
However, if this is an arbitrary amount of leading zeros, then you will have to store the content as a string in the database so you can capture the leading zeros.

It sounds like this should be stored as string data. It sounds like the leading zeros are part of the data itself, not just part of it's formatting.
You could reformat the data for display with the leading zeros in it, however I believe you should store the correct form of the ID number, it will lead to less bugs down the road (ex: you forgot to format it in one place but not in another).

There are a few ways of doing this - depending on the answers to my comments in your question:
Store the extra data in the database by converting the datatype from numeric to varchar/string.
Advantages: Very simple in its implementation; You can treat all the values in the same way.
Disadvantage: If you've got very large amounts of data, storage sizes will escalate; indexing and sorting on strings doesn't perform so well.
Use if: Each number may have an arbitrary length (and hence number of zeros).
Don't use if: You're going to be spending a lot of time sorting data, sorting numeric strings is a pain in the ass - look up natural sorting to see some of the pitfalls;
Continue to store the data in the database as numeric but pad the numeric back to a set length (i.e. 10 as I have suggested in my example below):
Advantages: Data will index better, search better, not require such large amounts of storage if you've got large amounts of data.
Disadvantage: Every query or display of data will require every data instance to be padded to the correct length causing a slight performance hit.
Use if: All the output numbers will be the same length (i.e. including zeros they're all [for example] 10 digits); Large amounts of sorting will be necessary.
Add a field to your table to store the original length of the numeric, continue to store the value as numeric (to leverage sorting/indexing performance gains of numeric vs. string) in your new field store the length as it would include the significant zeros:
Advantages: Reduction in required storage space; maximum use of indexing; sorting of numerics is far easier than sorting text numerics; You still get the ability to pad numerics to arbitrary lengths like you have with option 1.
Disadvantages: An extra field is required in your database, so all your queries will have to pull that extra field thus potentially requiring a slight increase in resources at query/display time.
Use if: Storage space/indexing/sorting performance is any sort of concern.
Don't use if: You don't have the luxury of changing the table structure to include the extra value; This will overcomplicate already complex queries.
If I were you and I had access to modify the db structure slightly, I'd go with option 3, sure you need to pull out an extra field to get the length. The slightly increased complexity pays huge dividends in the advantages versus the disadvantages. The performance hit of padding the string back out the correct length will be far superceded by the performance increase of the indexing and storage space required.

I worked with a database with a similar problem. They were storing zip codes as a number. The consequence was that people in New Jersey couldn't use our app.
You're using data that is logically a text string and not a number. It just happens to look like a number, but you really need to treat it as text. Use a text-oriented data type, or at least create a database view that enables you to pull back a properly formatted value for this.

See here: Pad or remove leading zeroes from numbers

declare #recordNumber integer;
set #recordNumber = 93088;
declare #padZeroes integer;
set #padZeroes = 8;
select
right( replicate('0',#padZeroes)
+ convert(varchar,#recordNumber), #padZeroes);

Unless you intend on doing calculations on that ID, its probably best to store them as text/string.

Another option is since the field is an id, i would recommend creating a secondary field for display number (nvarchar) that you can use for reports, etc...
Then in your application when the student id is entered you can insert that into the database as the number, as well as the display number.

An Oracle solution
Store the ID as a number and convert it into a character for display. For instance, to display 42 as a zero-padded, three-character string:
SQL> select to_char(42, '099') from dual;
042
Change the format string to fit your needs.
(I don't know if this is transferable to other SQL flavors, however.)

You could just concatenate '1' to the beginning of the ID when storing it in the database. When retrieving it, treat it as a string and remove the first char.
MySQL Example:
SET #student_id = '123456789';
INSERT INTO student_table (id,name) VALUES(CONCAT('1',#student_id),'John Smith');
...
SELECT SUBSTRING(id,1) FROM student_table;
Mathematically:
Initially I thought too much and did it mathematically by adding an integer to the student ID, depending on its length (like 1,000,000,000 if it's 9 digits), before storing it.
SET #new_student_id = ABS(#student_id) + POW(10, CHAR_LENGTH(#student_id));
INSERT INTO student_table (id,name) VALUES(#new_student_id,'John Smith');

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight