Naming convention of .unl files - export

I would like to understand the rule applied to the naming of the .unl files generated when I run dbexport on my Informix database.
I understand that:
+ Each file refers to a different table
+ Each file name uses 5 letters and 5 digits (example: tbabc01234.unl if the table name is tbabcdef).
+ The 5 letters are the first 5 letters of the table name.
+ If the table name has fewer than 5 letters, it is padded with underscores (example: tby__01234.unl if the table name is tby).
I would like to understand the meaning of the 5 digits. Is there a way to guarantee that the list is generated alphabetically?

The 5-digit number used in the unload file name is the tabid of the table.
I don't believe there is a way to guarantee that the list is alphabetically generated.
See the documentation for the dbexport and dbimport utilities.
To get the tabid of a table query the systables system catalog table:
SELECT tabname, tabid
FROM systables
WHERE tabname = '<TABNAME>';
For a given list:
SELECT tabname, tabid
FROM systables
WHERE tabname IN (
'<TABNAME1>',
'<TABNAME2>',
'<TABNAME3>',
...
)
ORDER BY tabid;
For all user tables:
SELECT tabname, tabid
FROM systables
WHERE tabid > 99
AND tabtype = 'T'
ORDER BY tabid;
Needless to say you can reverse search by tabid:
SELECT tabname, tabid
FROM systables
WHERE tabid = <TABID>;
SELECT tabname, tabid
FROM systables
WHERE tabid IN (
101,
102,
103,
...
)
ORDER BY tabid;
One way of mapping is:
SELECT tabname,
tabid,
RPAD(SUBSTR(tabname, 1, 5), 5, '_') || LPAD(tabid, 5, '0') || '.unl' AS unl_file
FROM systables
WHERE tabid > 99
AND tabtype = 'T'
ORDER BY tabid;
Whether this is a good approach depends on a few questions:
How many tables does the source database have?
How many tables do you intend to migrate?
The process should account for relations between tables and data (PK/FK/Triggers/...).
Remember that the storage clause mentions the dbspaces when you use the -ss option.
...
For example, if you have a database with 1000 tables and just want 10 of them, it is easier to extract the schema with dbschema and then perform an unload of the data, as sketched below.
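A minimal sketch, assuming a database mydb and a table customer (the database, table, file names and delimiter are placeholders):
-- schema pulled beforehand from the shell with, for example:
--   dbschema -d mydb -t customer -ss customer.sql
-- then, run in dbaccess against the source database:
UNLOAD TO 'customer.unl' DELIMITER '|'
SELECT * FROM customer;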
Here is a link to the list of data migration utilities that come with the engine.

Related

Full Text Search On Relational Tables?

I have a SQL Server database that has multiple tables that are all related to each other.
I am wondering is it possible to use FreeTextTable or ContainsTable for this?
I have only seen examples that look at one table and search all the columns. I have not seen a case where I have, say, a form. Let's call it a "student application form".
On this form there may be information like
Student First Name
Student Last Name
Address
Campus they wish to study at
Tell us about yourself
Now I want to build one search box that will search all these tables and find this "application".
The user might type in
John Smith
Main Campus
motivated
So all tables and columns would need to be checked, but the end result would be bringing back the application(s) that the full-text search thinks match what was typed in.
The table structure might be like this:
Application: id, firstName, lastName, campusId, AddressId
Campus: id, name
Address: id, name
In my real database I have 5 or 6 tables that join with the "application" table and would all need to be accounted for.
Can this be done with full text search? Or do I have to search each table individually and somehow tie it all together again.
You can use an indexed view that concatenates all the values that you want to search on...
CREATE VIEW dbo.V_FT
WITH SCHEMABINDING
AS
SELECT A.id,
CONCAT(A.firstName, '#', A.lastName, '#', C.name, '#', D.name) AS DATA
FROM dbo.Application AS A
JOIN dbo.Campus AS C ON C.id = A.campusId
JOIN dbo.Address AS D ON D.id = A.AddressId;
GO
CREATE UNIQUE CLUSTERED INDEX X_V_APP ON dbo.V_FT (id);
GO
Then create a full-text index on the DATA column of the indexed view.
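A rough sketch of that step, assuming no full-text catalog exists yet and using the unique clustered index created above as the key:
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;
GO
CREATE FULLTEXT INDEX ON dbo.V_FT (DATA)
KEY INDEX X_V_APP
WITH STOPLIST = SYSTEM;
GO
-- then, for example:
SELECT id FROM dbo.V_FT WHERE CONTAINS(DATA, '"John" AND "Main Campus"');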

What does master..sysdatabases and test..sysobjects mean?

I have recently been learning how to use SQL Server. I do not understand why master..sysdatabases and test..sysobjects are used in the following statements:
select name from [master]..[sysdatabases] where dbid=1;
select count(1) from [test]..[sysobjects] where xtype = 'U';
What does the 1 in count(1) mean? Does it mean the first column?
Thanks for any helpful answers.
Your first line basically gets the name of the master database (it looks at the list of all databases, and returns the name of the database with the ID of 1, which in this case is generally going to be 'master').
Do a SELECT to see all the databases on a server:
SELECT * FROM [master]..[sysdatabases]
Note that the row with "dbid" = 1 is the row for the "master" database, which is a system database present on all SQL Server instances.
Your second line counts the number of rows in the sysobjects collection in the database named 'test' where the type is a user table (i.e. not a stored procedure, not a system table, etc).
In the expression "[x]..[y]", the 'x' is the name of the database, and 'y' is the name of the table or view within that database.
If you had a database named "Foo", and in there was a table named "Bar", then this statement would return the count of rows in that table:
SELECT COUNT(1) FROM [Foo]..[Bar]
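The two dots simply omit the schema name, so, assuming the default dbo schema, the statement above is equivalent to:
SELECT COUNT(1) FROM [Foo].[dbo].[Bar]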
As Ed Gibbs described above, the '1' is just a placeholder for counting the total number of rows in the most efficient way possible on any database or version. It has become a sort of shorthand way of counting.

SQL Server Free Text Search vs In clause

I am currently using an IN clause on a varchar field. Will using CONTAINS from FTS help performance?
For e.g.
Select * from Orders where City IN ('London', 'New York')
vs
Select * from Orders where Contains(City, 'London or New York')
Thanks in advance.
Table Definition
CREATE TABLE Orders(ID INT PRIMARY KEY NOT NULL IDENTITY(1,1),City VARCHAR(100))
GO
INSERT INTO Orders
VALUES ('London'),('Newyork'),('Paris'),('Manchester')
,('Liverpool'),('Sheffield'),('Bolton')
GO
Create FTS on the City column using ID as the key.
SSMS was used to create the FTS index.
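For reference, a rough T-SQL equivalent (the catalog name and the key index name PK_Orders are assumptions; use the actual name of the primary key index on Orders):
CREATE FULLTEXT CATALOG OrdersCatalog AS DEFAULT;
GO
-- PK_Orders is assumed to be the unique index backing the primary key
CREATE FULLTEXT INDEX ON Orders (City)
KEY INDEX PK_Orders
WITH STOPLIST = SYSTEM;
GO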
Queries
-- Query 1
Select * from Orders
where City IN ('London' , 'NewYork')
GO
-- Query 2
Select * from Orders where
Contains (City, '"London" or "NewYork"')
GO
Execution Plans for both queries
As you can see, the query which used FTS cost 3 times more than the query which used the IN operator.
Having said this, when it comes to finding language-specific terms in SQL Server, FTS is the way to go, for example when looking for inflectional forms, synonyms and much more. Read here for more information.
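As an illustration of those language features, run against the sample Orders table above (the search words are just examples):
-- matches inflectional forms such as run, ran, running
SELECT * FROM Orders WHERE CONTAINS(City, 'FORMSOF(INFLECTIONAL, "run")');
-- matches synonyms defined in the language thesaurus
SELECT * FROM Orders WHERE CONTAINS(City, 'FORMSOF(THESAURUS, "metal")');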

Database tables: One-to-many of different types

Due to non-disclosure at my work, I have created an analogy of the situation. Please try to focus on the problem and not "Why don't you rename this table, merge those tables, etc.", because the actual problem is much more complex.
Here's the deal.
Let's say I have an "Employee Pay Rise" record that has to be approved.
There is a table with individual "Users".
There are tables that group Users together, for example "Managers", "Executives", "Payroll", "Finance". These groupings are different types with different properties.
When creating a "PayRise" record, the user who is creating the record also selects a number of these groups (managers, executives etc.) and/or single users who can 'approve' the pay rise.
What is the best way to relate a single "EmployeePayRise" record to 0 or more user records, and 0 or more of each of the groupings?
I would assume that the users are linked to the groups? If so, in this case I would just link the EmployeePayRise record to the one user that it applies to and the user that can approve it. So basically you'd have two columns representing this: the EmployeePayRise.employeeId and EmployeePayRise.approvalById columns. If you need to get to groups, you'd join on EmployeePayRise.employeeId = Employee.id. Keep it simple without over-complicating your design.
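For illustration, that two-column shape might look like this (the Users key column id is an assumption):
CREATE TABLE EmployeePayRise (
id INT PRIMARY KEY,
employeeId INT NOT NULL REFERENCES Users(id), -- the employee the rise applies to
approvalById INT NOT NULL REFERENCES Users(id) -- the single user who can approve it
);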
My first thought was to create a table that relates individual approvers to pay rise rows.
create table pay_rise_approvers (
pay_rise_id integer not null references some_other_pay_rise_table (pay_rise_id),
pay_rise_approver_id integer not null references users (user_id),
primary key (pay_rise_id, pay_rise_approver_id)
);
You can't have good foreign keys that reference managers sometimes, and reference payroll some other times. Users seems the logical target for the foreign key.
If the person creating the pay rise rows (not shown) chooses managers, then the user interface is responsible for inserting one row per manager into this table. That part's easy.
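For example, something like this, assuming a managers group table keyed by user_id (the table and column names, and the pay rise id 42, are illustrative):
INSERT INTO pay_rise_approvers (pay_rise_id, pay_rise_approver_id)
SELECT 42, user_id
FROM managers;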
A person that appears in more than one group might be a problem. I can imagine a vice-president appearing in both "Executive" and "Finance" groups. I don't think that's particularly hard to handle, but it does require some forethought. Suppose the person who entered the data changed her mind, and decided to remove all the executives from the table. Should an executive who's also in finance be removed?
Another problem is that there's a pretty good chance that not every user should be allowed to approve a pay rise. I'd give some thought to that before implementing any solution.
I know it looks ugly, but I think sometimes the solution can be to have the table name in the table and use a union query:
create table approve_pay_rise (
rise_proposal varchar2(10) -- foreign key to payrise table
, approver varchar2(10) -- key of record in table named in other_table
, other_table varchar2(15) );
insert into approve_pay_rise values ('prop000001', 'e0009999', 'USERS');
insert into approve_pay_rise values ('prop000001', 'm0002200', 'MANAGERS');
Then use either a case statement in code, repeated statements for each other_table value (select ... where other_table = '...' .. select ... where other_table = '...'), or a union select.
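For illustration, the union select might look like this (the key column names on users and managers are assumptions):
select apr.rise_proposal, u.user_id as approver
from approve_pay_rise apr
join users u on u.user_id = apr.approver
where apr.other_table = 'USERS'
union all
select apr.rise_proposal, m.manager_id as approver
from approve_pay_rise apr
join managers m on m.manager_id = apr.approver
where apr.other_table = 'MANAGERS';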
I have to admit I shudder when I encounter it, and I'll now go wash my hands after typing a recommendation to do it, but it works.
Sounds like you might need two tables ("ApprovalUsers" and "ApprovalGroups"). The SELECT statement(s) would be a UNION of UserIds from "ApprovalUsers" and the UserIDs from any other groups of users that are in the "ApprovalGroups" related to the PayRiseId.
SELECT UserID
INTO #TempApprovers
FROM ApprovalUsers
WHERE PayRiseId = 12345
IF EXISTS (SELECT GroupName FROM ApprovalGroups WHERE GroupName = 'Executives' AND PayRiseId = 12345)
BEGIN
INSERT INTO #TempApprovers (UserID)
SELECT UserID
FROM Executives
END
....
EDIT: this would/could duplicate UserIds, so you would probably want to GROUP BY UserID (i.e. SELECT UserID FROM #TempApprovers GROUP BY UserID)

joining latest of various usermetadata tags to user rows

I have a PostgreSQL database with a user table (userid, firstname, lastname) and a usermetadata table (userid, code, content, created datetime). I store various information about each user in the usermetadata table by code and keep a full history. So, for example, a user (userid 15) has the following metadata:
15, 'QHS', '20', '2008-08-24 13:36:33.465567-04'
15, 'QHE', '8', '2008-08-24 12:07:08.660519-04'
15, 'QHS', '21', '2008-08-24 09:44:44.39354-04'
15, 'QHE', '10', '2008-08-24 08:47:57.672058-04'
I need to fetch a list of all my users and the most recent value of each of various usermetadata codes. I did this programmatically and it was, of course, godawfully slow. The best I could figure out how to do it in SQL was to join sub-selects, which were also slow, and I had to do one for each code.
This is actually not that hard to do in PostgreSQL because it has the "DISTINCT ON" clause in its SELECT syntax (DISTINCT ON isn't standard SQL).
SELECT DISTINCT ON (code) code, content, createtime
FROM metatable
WHERE userid = 15
ORDER BY code, createtime DESC;
That will limit the returned results to the first result per unique code, and if you sort the results by the create time descending, you'll get the newest of each.
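To get this for all users at once, using the column names from the question (the user table name users is an assumption), the same idea extends to:
SELECT DISTINCT ON (u.userid, m.code)
u.userid, u.firstname, u.lastname, m.code, m.content, m.created
FROM users u
JOIN usermetadata m ON m.userid = u.userid
ORDER BY u.userid, m.code, m.created DESC;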
I suppose you're not willing to modify your schema, so I'm afraid my answer might not be of much help, but here goes...
One possible solution would be to leave the time field empty until it is replaced by a newer value, at which point you insert the 'deprecation date'. Another way is to expand the table with an 'active' column, but that would introduce some redundancy.
The classic solution would be to have both 'Valid-From' and 'Valid-To' fields where the 'Valid-To' fields are blank until some other entry becomes valid. This can be handled easily by using triggers or similar. Using constraints to make sure there is only one item of each type that is valid will ensure data integrity.
Common to these is that there is a single way of determining the set of current fields. You'd simply select all entries with the active user and a NULL 'Valid-To' or 'deprecation date' or a true 'active'.
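For instance, with the 'Valid-To' variant (the column name valid_to is an assumption), the current values for a user reduce to:
SELECT code, content
FROM usermetadata
WHERE userid = 15
AND valid_to IS NULL;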
You might be interested in taking a look at the Wikipedia entry on temporal databases and the article A consensus glossary of temporal database concepts.
A subselect is the standard way of doing this sort of thing. You just need a Unique Constraint on UserId, Code, and Date - and then you can run the following:
SELECT *
FROM Table
JOIN (
SELECT UserId, Code, MAX(Date) as LastDate
FROM Table
GROUP BY UserId, Code
) as Latest ON
Table.UserId = Latest.UserId
AND Table.Code = Latest.Code
AND Table.Date = Latest.Date
WHERE
Table.UserId = #userId
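Applied to the question's usermetadata table, the unique constraint mentioned above could be declared like this (the constraint name is illustrative):
ALTER TABLE usermetadata
ADD CONSTRAINT usermetadata_user_code_created_key UNIQUE (userid, code, created);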
