A single MySQL query for 'bouncing' table selects

So, say for the sake of simplicity, I have a master table containing two fields: the first is an attribute and the second is the attribute's value. If the second field references a value in another table, that is denoted in parentheses.
Example:
MASTER_TABLE:
Attr_ID | Attr_Val
--------+-----------
1 | 23(table1) --> 23rd value from `table1`
2 | ...
1 | 42 --> the number 42
1 | 72(table2) --> 72nd value from `table2`
3 | ...
1 | txt --> string "txt"
2 | ...
4 | ...
TABLE 1:
Val_Id | Value
--------+-----------
1 | some_content
2 | ...
. | ...
. | ...
. | ...
23 | some_content
. | ...
Is it possible to perform a single query in SQL (without parsing the results inside the application and re-querying the db) that would iterate through master_table and, for the given <attr_id>, get only the attributes that reference other tables (e.g. 23(table1), 72(table2), ...), then parse the table names from the parentheses (e.g. table1, table2, ...) and perform a query to get the (23rd, 72nd, ...) value (e.g. some_content) from that referenced table?
Here is something I've done, and it parses the Attr_Val for the table name, but I don't know how to assign it to a string and then do a query with that string.
PREPARE pstmt FROM
"SELECT * FROM information_schema.tables
WHERE TABLE_SCHEMA = '<my_db_name>' AND TABLE_NAME = ?";
SET @str_tablename =
(SELECT `table`.tablename FROM
(SELECT @string := (SELECT <string_column> FROM <table> WHERE ID = <attr_id>) AS String,
@loc1 := length(@string) - locate("(", reverse(@string)) + 2 AS `from`,
@loc2 := length(@string) - locate(")", reverse(@string)) + 1 - @loc1 AS `to`,
substr(@string, @loc1, @loc2) AS tablename
) AS `table`
); -- this returns 1 row, which is OK
EXECUTE pstmt USING @str_tablename; -- this then returns 0 rows
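For completeness, here is a rough, untested sketch of the direction I suspect is needed: a ? placeholder can only stand in for a value, not a table name, so the parsed name would have to be concatenated into the statement text itself (the id value and table/column names below are placeholders):
-- build the statement text so the parsed table name becomes part of the SQL
SET @sql = CONCAT('SELECT Value FROM ', @str_tablename, ' WHERE Val_Id = ?');
PREPARE pstmt2 FROM @sql;
SET @val_id = 23; -- placeholder: the id parsed from Attr_Val
EXECUTE pstmt2 USING @val_id;
DEALLOCATE PREPARE pstmt2;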
Any thoughts?

I love the purity of this approach, if pulled off. But I'm thinking you're creating a maintenance bomb. With a cure like this, who needs to be sick?
No one has ever said of a web site "Man, their data sure is pure!" They compliment what is being done with the data. I don't recommend you keep your hands tied behind your back on this one. I guarantee your competitors aren't.

Related

Use Excel to summarise data from a column by identifier

I have a spreadsheet with a column called MRN (the identifier) and the drugs administered next to them. There are duplicates of the MRN in column A that correspond to different courses of drugs. What I'm hoping to do is to summarise all the drugs administered associated with one MRN in one line, removing all duplicates. It looks something like this.
|   | A   | B           |
| 1 | MRN | Item        |
| 2 | 1   | cefoTAXime  |
| 3 | 1   | ampicillin  |
| 4 | 1   | cefoTAXime  |
| 5 | 1   | vancomycin  |
| 6 | 1   | cefTRIaxone |
| 7 | 2   | ampicillin  |
| 8 | 2   | vancomycin  |
| 9 | 2   | vancomycin  |
I have 3 different formulas. The first is to produce a list of MRNs that are all unique. The second is to pull all drugs by MRN and list them in one line. The third is to remove duplicates from this list. They are below (in order).
{=IFERROR(INDEX($A$2:$A$2885, MATCH(0,COUNTIF(D$1:$D1, $A$2:$A$2885),0 )),"")}
{=INDEX($A$2:$B$2885,SMALL(IF($A$2:$A$2885=$D2,ROW($A$2:$A$2885)),COLUMN(D:D))-4,2)}
{=IFERROR(INDEX($E$2:$AE$2, MATCH(0,COUNTIF(D$3:$D3, $E$2:$AE$2),0 )),"")}
*I know that I can edit the second one by adding IF(ISERROR ...) to remove NA and print blanks if drug not found, but want to keep the formulas as simple as possible at this time.
My problem is that the second formula isn't pulling all the drugs by MRN, and in an ideal world I would be able to combine the second and third formulas into one, but I am not sure how. Here is a link to a test file that shows my issue and the formulas in action.
https://1drv.ms/x/s!ApoCMYBhswHzhooXnumW2iV7yx-JaA
I appreciate that there may be a better way to do this using python/R, and if that's possible then I'm more than happy to try, but I couldn't make any headway. Thanks for your help and suggestions.
If you could deal with a count of the number of courses per drug per MRN, you can do this with Power Query (aka Get & Transform in Excel 2016).
Starting with the data you provided on your worksheet, the result is a table with one row per MRN and one column per drug, holding the number of courses of that drug for that MRN.
M-Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"MRN", Int64.Type}, {"Item", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"MRN"}, {{"Count", each _, type table}}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"MRN", "Item"}, {"Count.MRN", "Count.Item"}),
#"Pivoted Column" = Table.Pivot(#"Expanded Count", List.Distinct(#"Expanded Count"[Count.Item]), "Count.Item", "Count.MRN", List.NonNullCount)
in
#"Pivoted Column"

Use TQuery.Locate() function to find other than the first match

Locate moves the cursor to the first row matching a specified set of search criteria.
Let's say that q is a TQuery component connected to a database table with two columns, TAG and TAGTEXT. With the following code I get the letter a, but I would like to use the Locate() function to get the letter d.
if q.Locate('TAG', '1', [loPartialKey]) then
begin
  tag60 := q.FieldByName('TAGTEXT').AsString;
end;
For example, if I have a table like this:
TAG | TAGTEXT
+---+--------+
| 1 | a |
+---+--------+
| 2 | b |
+---+--------+
| 3 | c |
+---+--------+
| 1 | d |
+---+--------+
| 4 | e |
+---+--------+
| 1 | f |
+---+--------+
Is it possible to locate the second occurrence of the number 1 in the table?
EDIT
My job is to find a given occurrence of TAG with value 1 (which occurrence I need depends on a parameter I receive). I need to iterate through the table and collect the values from all the TAGTEXT fields until the TAG field is number 1 again. Number 1 in this case represents the start of a new segment, and everything between two number 1s belongs to one segment. Segments don't necessarily have the same number of rows. Also, I am not allowed to make any changes to the table.
What I thought I could do is create a counter variable that is increased by one every time a row with TAG value 1 is reached. When the counter equals the parameter that represents the occurrence, I know that I am in the right segment, and I can iterate through that segment and get the values I need.
But this might be a slow solution, and I wanted to know if there is a faster one.
You need to be a bit wary of using Locate for a purpose like this, because some TDataSet descendants' implementations of Locate (or the underlying db-access layer) construct a temporary index on the dataset, which can be discarded immediately afterwards, so repeatedly calling Locate to iterate the rows of a given segment may be a lot more inefficient than one might expect.
Also, TClientDataSet constructs, uses and then discards an expression parser for each invocation of Locate (in its internal call to LocateRecord), which is a lot of overhead for repeated calls, especially when they are entirely avoidable.
In any case, the best way to do this is to ensure that your table records which segment a given row belongs to, adding a column like the SegmentID below if your table does not already have one:
TAG | TAGTEXT | SegmentID
+---+---------+-----------+
| 1 | a       | 1
| 2 | b       | 1
| 3 | c       | 1
| 1 | d       | 2
+---+---------+-----------+ // btw, what happened to the 2 missing rows after this one?
| 4 | e       | 2
| 1 | f       | 3
+---+---------+-----------+
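As an aside (not part of the original suggestion), if the backend supports window functions, such a SegmentID could also be computed on the fly rather than stored; a hedged sketch, where RowID is a purely illustrative column assumed to preserve the physical row order:
-- each row with TAG = 1 starts a new segment, so a running count of those rows gives the segment number
SELECT RowID, TAG, TAGTEXT,
       SUM(CASE WHEN TAG = 1 THEN 1 ELSE 0 END)
           OVER (ORDER BY RowID) AS SegmentID
FROM mytable;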
Then, you could use code like this to iterate the rows of a segment:
procedure IterateSegment(Query : TSomeTypeOfQueryComponent; SegmentID : Integer);
var
  Sql : String;
begin
  Sql := Format('select * from mytable where SegmentID = %d order by Tag', [SegmentID]);
  if Query.Active then
    Query.Close;
  Query.Sql.Text := Sql;
  Query.Open;
  Query.DisableControls;
  try
    while not Query.Eof do begin
      // process row here
      Query.Next;
    end;
  finally
    Query.EnableControls;
  end;
end;
Once you have the SegmentID column in the table, if you don't want to open a new query to iterate a block, you can set up a local index (by SegmentID then Tag), assuming your dataset type supports it, set a filter on the dataset to restrict it to a given SegmentID, and then iterate over it.
You have many options to do this.
If your component doesn't provide a LocateNext, you can write your own LocateNext function, comparing the value and calling Next until you find the next match.
You can also order the SQL with an ORDER BY, then use Locate for the first value and test whether the next value matches the comparison.
If you use a TClientDataSet, you can filter with the component's Filter property, or set IndexFieldNames to order the values instead of the ORDER BY in the prior suggestion.
You can filter it in the SQL WHERE clause too, as sketched below.
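For the plain SQL options, a minimal sketch of what is meant (the table name, parameter name and ordering column are placeholders, not taken from the question):
-- fetch only the rows of interest and let the dataset walk them
SELECT TAG, TAGTEXT
FROM mytable
WHERE TAG = :TagValue       -- WHERE-clause filter
ORDER BY SomeOrderColumn;   -- assumed column that preserves row order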

Can I set rules for string comparison in SQL? (or do I need to hardcode using CASE WHEN)

I need to compare ratings at two points in time and indicate whether the change was upwards, downwards, or the rating stayed the same.
For example:
This would be a table with four columns:
ID | T0  | T0+1 | Status
---+-----+------+--------
1  | AAA | AA   | Lower
2  | BB  | A    | Higher
3  | C   | C    | Same
However, this does not work when applying regular string comparison, because in SQL
A<B
B<BBB
I need
A>B
B<BBB
So my order (highest to lowest): AAA, AA, A, BBB, BB, B
SQL order (highest to lowest): BBB, BB, B, AAA, AA, A
Now I have 2 options in mind, but I wonder if someone knows a better one:
1) Use CASE WHEN statements for all the possibilities of ratings going up and down (I have more values than indicated above):
CASE WHEN T0 = [T0+1] THEN 'Same'
     WHEN T0 = 'AAA' AND [T0+1] <> 'AAA' THEN 'Lower'
     -- ...address all other options for the rating going down
     ELSE 'Higher'
END
However, this generates a very large number of CASE WHEN statements.
2) My other option requires generating 2 tables. In table 1 I use case when statements to assign values/rank to the ratings.
For example:
CASE WHEN T0 = 'AAA' THEN 6
     WHEN T0 = 'AA'  THEN 5
     WHEN T0 = 'A'   THEN 4
     WHEN T0 = 'BBB' THEN 3
     WHEN T0 = 'BB'  THEN 2
     WHEN T0 = 'B'   THEN 1
END
The same for T0+1.
Then in table 2 I use a regular comparison between column T0 and column T0+1 on the numeric values.
However, I am looking for a solution where I can do it in one table (with as few lines as possible), and ideally never really show the ranking column.
I think a nested statement would be the best option, but it did not work for me.
Does anybody have suggestions?
I use SQL Server 2008.
If you are using credit ratings, it is very likely that this is not just about AAA > AA or BBB > BB.
Whether you are using one agency or another, it could also be AA+ or Aa1 for long-term ratings, F1+ for short-term ones, or something else in different contexts or with other agencies.
It is also often required to convert data from one agency's rating scale to another's.
Therefore it is better to use a mapping table such as:
Id | Rating
0 | AAA
1 | AA+
2 | AA
3 | AA-
4 | A+
5 | A
6 | A-
7 | BBB+
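If you do not have such a table yet, a minimal sketch of creating and populating it might look like this (the name RatingMapping matches the query below; the abbreviated rating list is illustrative and should be extended to your full scale):
-- lower Id means better rating, so ordinary integer comparison gives the order
CREATE TABLE RatingMapping (
    Id     tinyint    NOT NULL PRIMARY KEY,
    Rating varchar(4) NOT NULL UNIQUE
);
INSERT INTO RatingMapping (Id, Rating)
VALUES (0, 'AAA'), (1, 'AA+'), (2, 'AA'), (3, 'AA-'),
       (4, 'A+'), (5, 'A'), (6, 'A-'), (7, 'BBB+');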
Using this table, you only have to join the rating in your data table with the rating in the mapping table:
SELECT d.Rating_T0, d.Rating_T1,
       CASE WHEN d.Rating_T0 = d.Rating_T1 THEN '='
            WHEN m0.id < m1.id THEN '<'
            WHEN m0.id > m1.id THEN '>'
       END
FROM yourData d
INNER JOIN RatingMapping m0
        ON m0.Rating = d.Rating_T0
INNER JOIN RatingMapping m1
        ON m1.Rating = d.Rating_T1
If you only store the Rating Id in your data table, you will not only save space (1 byte for tinyint versus up to 4 chars) but will also be able to compare without the JOIN to the mapping table.
SELECT d.Rating_Id0, d.Rating_Id1,
       CASE WHEN d.Rating_Id0 = d.Rating_Id1 THEN '='
            WHEN d.Rating_Id0 < d.Rating_Id1 THEN '<'
            WHEN d.Rating_Id0 > d.Rating_Id1 THEN '>'
       END
FROM yourData d
The JOIN would only be required when you want to display the actual Rating value, such as AAA for Rating_ID = 0.
You could also add an Agency_Id to the mapping table. This way, you can easily choose which agency's notation you want to display and easily convert between Agency 1 and Agency 2 or Agency 3 (i.e. Id 1 => S&P, Id 2 => Fitch, Id 3 => ...).
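A hedged sketch of that cross-agency idea, assuming the mapping table gains an Agency_Id column and that the numeric Id is the shared rank (the agency ids below are illustrative):
-- translate one agency's notation to another's via the shared rank
SELECT sp.Rating AS SP_Rating, f.Rating AS Fitch_Rating
FROM RatingMapping sp
INNER JOIN RatingMapping f
        ON f.Id = sp.Id        -- same position on the shared scale
       AND f.Agency_Id = 2     -- illustrative: 2 = Fitch
WHERE sp.Agency_Id = 1;        -- illustrative: 1 = S&P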

sphinx - Column count doesn't match

I have the following in my Sphinx index:
mysql> desc rec;
+-----------+---------+
| Field | Type |
+-----------+---------+
| id | integer |
| desc | field |
| tid | uint |
| gid | uint |
| no | uint |
+-----------+---------+
And I ran the following successfully in SphinxQL:
replace into rec VALUES ('24','test test',1,1, 1 );
But when I run it through the C MySQL API I get this error:
Column count doesn't match value count at row 1
The C code is this:
if (mysql_query(con, "replace into rec VALUES ('24','test test',1,1, 1 )") )
{
    fprintf(stderr, "%s\n", mysql_error(con));
    mysql_close(con);
    exit(1);
}
Please note that the C program connects to SphinxQL with no issues.
One problem may be that you are quoting the integer for the id column. I would try taking out the single quotes around the 24. The column named desc is also concerning, since that is a reserved word in MySQL.
A good practice is to always specify the column names, even if you are inserting into all columns. The reason is that you may want to alter the table later to add a column, and you don't necessarily want to go back and change all your code to match the new structure. It also makes your code clearer, since you don't have to reference the table structure to know what the values mean, and it helps in case a tool like Sphinx is using a different order for the columns than you expect. Try changing your code to this, which specifies the columns and quotes them (MySQL uses backticks for quoting) and also removes the quotes around the value for the id column:
if (mysql_query(con, "replace into rec (`id`, `desc`, `tid`, `gid`, `no`) VALUES (24, 'test test', 1, 1, 1)") )

How to retrieve multiple values when more than one string matches with TSQL?

Consider a table that contains:
ReturnValueID | ReturnValue  | TriggerValue
--------------+--------------+--------------
1             | returnValue1 | testvalue
2             | returnValue2 | testing...
3             | returnValue3 | value3
And given a string: HERE IS THE TEXT testing... AND MORE TEXT testvalue MORE TEXT
I have written a CTE using SQL Server 2008 that uses a FindInString() function I wrote to indicate where the matched text is found. 0 = not found:
1 | returnValue1 | 43
2 | returnValue2 | 18
3 | returnValue3 | 0
What I need to do now, is iterate through this result set in a loop where I will perform some additional logic based on each row.
I have seen a few examples of looping, but I would rather not use a cursor.
What is the best way to approach this?
Thanks.
-- UPDATE --
Once a match is made, the ID of the matched row is added to a table if it isn't already there; then the return value is appended to a VARCHAR value if it doesn't already exist in the dynamic string:
IF NOT EXISTS -- check if this value is already recorded
(
    SELECT *
    FROM RecordedReturnValue
    WHERE ReturnValueID = @ReturnValueID
)
BEGIN
    -- add the visitor/external tag ID to historical table
    INSERT INTO RecordedReturnValue (...)
    VALUES (...)
    -- function checks if string is already present
    SET @DynamicString = dbo.AppendDynamicOutput(@ReturnValue, @DynamicString)
END
This must be performed for each matched TriggerValue from the CTE.
Ended up using a CTE, added the values to a temp table, then iterated through the results and performed some logic.
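A hedged sketch of what that can look like; FindInString(), RecordedReturnValue, ReturnValueID and TriggerValue come from the question, while the lookup table name (ReturnValues), the @InputText variable and the argument order of FindInString are assumptions:
;WITH Matches AS
(
    SELECT ReturnValueID, ReturnValue,
           dbo.FindInString(TriggerValue, @InputText) AS FoundAt -- argument order assumed
    FROM ReturnValues                                            -- assumed lookup table name
)
SELECT ReturnValueID, ReturnValue, FoundAt
INTO #MatchedRows        -- temp table holding only the actual matches
FROM Matches
WHERE FoundAt > 0;

-- the recording step can then be done set-based; only the string append still needs
-- to walk the rows of #MatchedRows (other columns of RecordedReturnValue omitted)
INSERT INTO RecordedReturnValue (ReturnValueID)
SELECT m.ReturnValueID
FROM #MatchedRows m
WHERE NOT EXISTS (SELECT 1 FROM RecordedReturnValue r
                  WHERE r.ReturnValueID = m.ReturnValueID);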
