I am trying to automate a process where a team member is given a CSV file with columns for first name, last name, email address, address, phone number, and several other columns that are unique to each customer, and then has to upload that CSV file into our system. For example, one customer might send a file with only first name, last name, and email address, while the next might send a file with those three plus certain ID columns that are valuable only to them.
The issue is that each customer names their columns differently than our database does, so the user has to manually map their columns to ours through a dropdown menu.
For example, the column name in their uploaded file could be "first_name", which would match directly with our db column "first_name". However, some customers could have "f_name" or "first name", and we would have to map those in the dropdown to the "first_name" column in our database.
Unfortunately these files come from the customers, and we cannot require everyone to use a single CSV document with exactly the same columns. Is there a way to avoid mapping these manually and have the data entered automatically into the correct database columns?
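One common approach is to auto-suggest mappings by normalizing the incoming header names and checking them against your database columns plus a table of known aliases, with a fuzzy fallback for near-misses, and only falling back to the manual dropdown for headers that still don't match. A minimal sketch in Python; the column names and the alias table here are made up for illustration:

```python
import difflib

# Hypothetical alias table: each db column and the header spellings we've seen for it.
ALIASES = {
    "first_name": {"first_name", "f_name", "firstname", "first name", "fname"},
    "last_name": {"last_name", "l_name", "lastname", "last name", "lname"},
    "email": {"email", "e_mail", "email_address", "email address"},
}

def normalize(header):
    """Lowercase and collapse separators so 'First Name' matches 'first_name'."""
    return header.strip().lower().replace("-", " ").replace("_", " ")

def suggest_mapping(csv_headers):
    """Return {csv_header: db_column_or_None}; None means 'ask the user'."""
    mapping = {}
    for header in csv_headers:
        norm = normalize(header)
        match = None
        for db_col, aliases in ALIASES.items():
            if norm in {normalize(a) for a in aliases}:
                match = db_col
                break
        if match is None:
            # Fuzzy fallback catches close misspellings of a known column name.
            close = difflib.get_close_matches(
                norm, [normalize(c) for c in ALIASES], n=1, cutoff=0.8)
            match = close[0].replace(" ", "_") if close else None
        mapping[header] = match
    return mapping
```

Unmatched headers (the `None` entries) would still go through today's dropdown, and each manual choice can be saved back into the alias table so the same customer file maps itself next time.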
I have multiple flat files (.csv) as my source in a folder. Each file has a varying number of columns, which may or may not intersect with the columns of the other files. However, all columns in any source file are always present in my destination table, which contains the superset of all these columns.
My requirement is to loop through each of these files and dynamically map the columns that are available in that file to the destination table (the header names in each CSV file match the column names in the table).
Structure of File 1:
id, name, age, email
Structure of File 2:
id, name, age, address, country
Structure of File 3:
id, name, age, address
Structure of Destination Table:
id, name, age, address, country, email
I want to populate the table with data for the columns that are available, and NULL for those that are not, for every record. How can I achieve this using SSIS?
You can do this by adding one Flat File Connection Manager; in it, add only one column with data type DT_WSTR and a length of 4000 (assume its name is Column0).
In the dataflow task add a Script Component after the Flat File Source
In the Script Component, mark Column0 as an input column and add 6 output columns (id, name, age, address, country, email).
In the Input0_ProcessInputRow method, split this column and assign the values to the output columns (you can write whatever logic you want). You can read the answers to the following question for an example: Reading CSV file some missing columns
The Flat File Source does not support a dynamic file format; without a workaround like this, you would have to use a separate source for each file layout.
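The splitting logic inside the Script Component would be C#, but the idea can be sketched in Python: read each row, pair the values with the file's own header, and emit None (loaded as NULL) for any destination column the file doesn't have. The column list is taken from the destination table above:

```python
import csv
import io

# Superset of columns, per the destination table in the question.
DEST_COLUMNS = ["id", "name", "age", "address", "country", "email"]

def rows_for_destination(file_text):
    """Yield one dict per data row, keyed by the destination columns.

    Columns missing from this particular file come out as None (NULL).
    """
    reader = csv.reader(io.StringIO(file_text))
    header = [h.strip() for h in next(reader)]
    for values in reader:
        row = dict(zip(header, values))
        yield {col: row.get(col) for col in DEST_COLUMNS}
```

For File 1 (id, name, age, email), every yielded row has address and country as None, which is exactly what the destination insert needs.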
I have an employee table with both Active and Terminated employees; one of the columns is the work email. Terminated employees should have their personal emails listed, and Active employees should have their work emails listed. However, the source system I am pulling the information from allows active employees to update their work email, so I have some active employees with their personal emails listed.
For my purposes, I need all active employees to have their work email listed. In SSIS, what would be the best approach to solving my issue?
Ex:
Name Status Email
Bob Act bob@workdomain.com
Joey Ter joey234@yahoo.com
Randy Act randy23@hotmail.com
Here, since Randy is an Active employee, he should have an email ending with @workdomain.com, but in the source system I pulled the data from, Randy changed his email to his personal one. Randy's email should be: randy@workdomain.com
I would do it as below (to avoid the OLE DB Command transformation, which is a row-by-row operation).
Load the data into 2 staging tables (they may be temporary tables):
Table #1: StageEmp
Name Status Email
Bob Act bob@workdomain.com
Joey Ter joey234@yahoo.com
Randy Act randy23@hotmail.com
Table #2: CorrectEmail
Name Email
Randy randy@workdomain.com
In an Execute SQL Task:

UPDATE s
SET s.email = c.email
FROM StageEmp s
JOIN CorrectEmail c
  ON s.Name = c.Name
WHERE s.status = 'Act'
  AND s.email NOT LIKE '%@workdomain.com'
You should use a key column in place of Name.
Then load the data from StageEmp into the actual table, or run the update directly against the actual table.
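The whole staging-table fix can be sketched end to end with Python's built-in sqlite3; the table and column names follow the example above, and the update is written as a correlated subquery so it runs on any SQLite version (your real code would run the T-SQL form above against the actual database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StageEmp (Name TEXT, Status TEXT, Email TEXT);
    CREATE TABLE CorrectEmail (Name TEXT, Email TEXT);
    INSERT INTO StageEmp VALUES
        ('Bob',   'Act', 'bob@workdomain.com'),
        ('Joey',  'Ter', 'joey234@yahoo.com'),
        ('Randy', 'Act', 'randy23@hotmail.com');
    INSERT INTO CorrectEmail VALUES ('Randy', 'randy@workdomain.com');
""")

# Fix active employees whose staged email is not a work address.
conn.execute("""
    UPDATE StageEmp
    SET Email = (SELECT c.Email FROM CorrectEmail c
                 WHERE c.Name = StageEmp.Name)
    WHERE Status = 'Act'
      AND Email NOT LIKE '%@workdomain.com'
      AND Name IN (SELECT Name FROM CorrectEmail)
""")
```

Only Randy's row is touched: Bob already has a work address and Joey is terminated, so both fall out of the WHERE clause.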
As I understand it, you have one Data Flow Task under which you have one Excel Source to fetch data from the Excel file. You can do it this way:
1. Add a "Lookup" component and pass the output of the "Excel Source" to it.
2. Edit the "Lookup" component and add a second input to it from your existing table (where you have the work email).
3. Join the two inputs in the "Lookup" component on one of the key columns, so that the output has two email columns: one from the source and one from the destination.
4. Add a "Derived Column" component and pass the output of the "Lookup" to it.
5. Edit the "Derived Column", add a new column, and add an expression that checks whether the source email column contains "@workdomain.com". If it does, use the source email; otherwise use the destination email.
I hope I was able to explain it.
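Outside SSIS, the lookup-plus-derived-column step boils down to one conditional per row. A rough Python equivalent (the domain name follows the example in the question):

```python
WORK_DOMAIN = "@workdomain.com"

def resolve_email(source_email, destination_email):
    """Mimic the Derived Column expression: keep the source email only if it
    is already a work address; otherwise fall back to the email looked up
    from the destination table."""
    if source_email.lower().endswith(WORK_DOMAIN):
        return source_email
    return destination_email
```

This is only a sketch of the decision; in the package itself it would be an SSIS expression inside the Derived Column, not Python.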
I have two tables:
student(studentid, name, class)
Course (courseID, CourseName, CreditHours)
These tables have a many-to-many relationship, so I created a third table:
student_course(studentID,CourseID)
Now I have designed the forms in MS Access through which the user enters data into the student and course tables.
But I want to design a form through which the user can assign courses to students, which means the user has to enter data into the student_course table.
I want users to see and enter the student name and course name instead of the IDs; behind the scenes, the IDs should be stored in the student_course table.
Can anyone help me with this?
You can use a combo box. Try inserting a combo box on the form using the wizard; it should walk you through the steps of showing the course name in the front end while saving the course ID in the back-end database.
Good morning!
I need to query, in my database, a column of a table whose name I don't know up front. The situation is the following:
In my application, a table is created for each project, named after that project: the given name is concatenated with the date and time of creation. The name of this table is then stored in another table called projects, which has a field indicating the client the project belongs to. When I do a SELECT, I want to see the names of the application's projects related to the clients' IDs, look up the corresponding tables in the database for those clients, and retrieve them, so that we can finally see the desired fields.
I don't know if I was clear; if you need more details, just ask!
Thanks!
If I understood you correctly, you need to find the exact names of the tables that were named after your project, plus some additional characters in their names (that look like dates and times).
Well, you can list all the tables whose names start with the name of your project, using a query like this:
SELECT *
FROM sys.tables
WHERE name LIKE 'yourprojectname%'
sys.tables is a system view where all your tables are listed.
'yourprojectname%' is a mask used for filtering the list of tables. The % character is necessary: it means 'any character or characters, any number of them (or none at all)'. (Without %, the output would show you only the one table whose name is exactly your project's name, if such a table exists.)
I am looking at a problem that would involve users uploading lists of records with various field structures into an application. The second part of this would be to also allow the users to specify fields to capture information.
This is a step beyond anything I've done up to this point, where I would have designed a static RDBMS structure myself. In some respects all records will be treated the same, so there will be some common fields required for each. Almost all queries will be run on these common fields.
My first thought would be to dynamically generate a new table for each import, and another for each data-capture field spec. Then have a master table with a GUID for every record in the application, along with the common fields, plus fields that hold the name of the table the data was imported to and the name of the table with the data-capture fields.
Further information (metadata?) about the fields in the dynamically generated tables could be stored in xml or in a 'property' table.
This would mean that as users log into the application I would be dynamically choosing which table of data to present to them, and there would be a large number of tables in the database if the application were not only multi-user but multi-tenant.
My question is: are there other methods for solving this kind of variable-field issue, or am I going down an unadvised path here?
I believe that EAV would require me to have a table defining the fields for each import / data-capture spec, and then another table with the import-field-value data, and that seems impractical.
I hate storing XML in the database, but this is a perfect example of when it makes sense. Store the user imports in XML initially. As your data schema matures, you can later decide which tables to persist for your larger clients. When the users pick which fields they want to query, that's when you come back and build a solid schema.
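As a sketch of that approach, each imported record can be serialized with whatever fields it happens to have and stored in a single XML column; the element and attribute names here are illustrative, not a fixed schema:

```python
import xml.etree.ElementTree as ET

def record_to_xml(fields):
    """Serialize one imported record (a dict of arbitrary field names)
    into an XML fragment suitable for storing in a single column."""
    record = ET.Element("record")
    for name, value in fields.items():
        field = ET.SubElement(record, "field", name=name)
        field.text = "" if value is None else str(value)
    return ET.tostring(record, encoding="unicode")
```

Because every record carries its own field names, imports with different structures all fit in one column, and you only promote frequently queried fields to real columns later.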
What kind is each field? Could the type of field be different for each record?
I am working on a program now that does this, sort of, and the way we handle it is basically a record table which points to a recordfield table. The recordfield table contains all of the fields, along with the name of the actual field in the database (the column name). We then have a recorddata table, which is where all the data goes for each record; it also stores a record_id telling it which record the data belongs to.
This way, if every column for a record is of the same type, we don't need to add new columns to the table; and if a record has more fields, or fields of a different type, we add columns to the data table as appropriate.
I think this is what you are talking about.. correct me if I'm wrong.
I think that one additional table per type of user-defined field, for each table the user can add fields to, is a good way to go.
Say you load your records into user_records(id); that table would have an id column which is a foreign key in the user-defined field tables.
User-defined string fields would go in user_records_string(id, name), where id is a foreign key to user_records(id), and name is a string (or a foreign key to a list of user-defined string fields).
Searching on them requires joining them to the base table, probably with a sub-select to filter down to one field based on the user metadata, so that the right field can be added to the query.
To simulate the user creating multiple tables, you can have a foreign key in the user_records table that points at a table list, and filter on that when querying for a single table.
This would allow your schema to be static while allowing the user to arbitrarily add fields and tables.
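A sketch of that layout, again with Python's built-in sqlite3: user_records and user_records_string follow the naming above (with a value column added here for illustration), and the sub-select pulls one user-defined field into the base query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_records (id INTEGER PRIMARY KEY);
    -- One side table per field type; this is the string one.
    CREATE TABLE user_records_string (
        id    INTEGER REFERENCES user_records(id),
        name  TEXT,   -- which user-defined field this value belongs to
        value TEXT
    );
    INSERT INTO user_records VALUES (1), (2);
    INSERT INTO user_records_string VALUES
        (1, 'colour', 'red'),
        (1, 'size',   'large'),
        (2, 'colour', 'blue');
""")

# Pull one user-defined field ('colour') into the base query via a sub-select.
rows = conn.execute("""
    SELECT r.id,
           (SELECT s.value FROM user_records_string s
            WHERE s.id = r.id AND s.name = 'colour') AS colour
    FROM user_records r
    ORDER BY r.id
""").fetchall()
```

The schema itself never changes as users add fields: a new field is just new rows in user_records_string (or its numeric/date siblings), and each query names the fields it wants in sub-selects like the one above.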