Narrow down an Access Design View export (follow-up to existing Q/A) - export

https://stackoverflow.com/a/61730969/2818235 answers the root question I have, but I want the "Export" described there to be more precise (so I don't have to edit the resulting Excel sheet quite so much). I'm wondering if what it's doing is done by a Query behind the scenes, and if so, is there a way to see that query so I can modify it to only output the fields I want. Thanks!

Related

Save an entire JFrame in a database

I have created a relatively large Jython JFrame, full of fields, tabs, tables and nested tables.
I honestly wouldn't want to take all these fields and create a super giant relational database, far from it.
I would just like to take the whole frame and save it, without taking field by field (I would never end up otherwise ...!).
I know this is not best practice, but I would highly prefer to do so, even thinking about future purposes.
Not being a backend expert, I was wondering which database technology (DBMS and similar) lends itself to being able to save a JFrame, therefore an object / class.
The goal, once the JFrame has been saved, is to recall it, with the related database query language you would suggest, and to reopen it and see all the fields filled in as I left them, nested elements included.
I sincerely hope that there is a way to be able to save a JFrame in a DB and extract it exactly as it was left.
This would give an important implication to my future projects.
Thanks in advance!

Adding new Excel files to MS Access database as they come in

I am in the situation where I have a questionnaire that is basically just a plain excel spreadsheet with two columns:
one column with the questions and
a second column next to it where users can fill in their answers.
Each respondent has been sent a copy of the file and they will email back their files individually over a long time period. I can't wait until I have all files back; instead I would like to collect (and use) the data in Access as the files come in.
Two questions:
What is the best setup in terms of the manual steps required when a new data file comes in? Can one just save the file in a specific folder and somehow have the column (column B) with responses "automatically" added to the main database? If not fully automatically, what could be done with just a few manual steps involved?
I realize that the shape of the questionnaire is not ideal (variables are in rows, not in columns). What's the best way to deal with that?
Thanks in advance for any pointers!
PS: I'd be open to (simple) alternatives, if Access is not the best choice for this. Analysis of the data will be done in Excel again in the end.
Update, to clarify the questions below:
1) In the short - medium term, we are expecting 50-100 replies. In the long term, it will be more, as people will be asked to send updates when their situation changes - these will have to be added as new entries with a new date attached to them, i.e. it will be a continuous process with a few answers coming in every few weeks.
2) There are 80 questions on the questionnaire.
3) The Excel files come back as email attachments.
4) I was contemplating using Access, as I thought it would a) make it a bit cleaner and less error-prone, especially as project managers might change in the future, b) allow for better handling of the data, as it will have to be mashed up and reshaped in different ways for the analysis (e.g. it has to be un-pivoted, which I don't even know if Excel can do), and c) give us more flexibility in the future when it comes to using different tools for analysis, i.e. each tool can just query the database. I am open to other suggestions, including Excel-only solutions, if that makes it easier, though.
5) I envision the base table to have all the 80 variables in different columns, and the answers as rows (i.e. each new column that comes with each excel file will need to be transposed and added as a new row). There will be other data tables with the same primary key as the row identifier in this table.
6) I haven't worked on the analysis part yet, but I know that it will require a lot of reshaping and merging of data sets.
Answer 1 - Questions
You do not provide enough information to allow anyone to give you pointers. Some initial questions:
How many questionnaires are you expecting: 10, 100, 1000?
How many questions are there per questionnaire?
How are the questionnaires reaching you? You say "email back". Does this mean as an attachment or as a table in the body of the email?
You say the data is arriving as Excel files and you intend to do the analysis in Excel. Why are you storing the answers in Access? I am not saying you are wrong to store the results in Access; I just want to be convinced you have a reason.
Have you designed the planned table structure for Access?
Have you designed the structure of the Excel workbook(s) on which you will perform the analysis?
Answer 2
Firstly, I should say that I agree with Mat. I am not an expert on questionnaires but my understanding is that there are companies that will host online questionnaires and provide the results in a convenient form.
Most of the rest of this answer assumes it is too late to consider an online questionnaire or you have, for whatever reason, rejected that approach.
An Access project is, to a degree, self-documenting. You can look at its list of tables and see that Table 1 has columns A, B and C. If created properly you can see the relationships between tables. With an Excel workbook you just have a number of worksheets which can contain anything. There is no automatic documentation.
However, with both Excel and Access the author can create complete documentation that explains each table, worksheet, report and macro. If this project is going to last indefinitely and have a succession of project managers, such documentation will be essential. I can tell you from bitter experience that trying to understand a complex Access project or Excel workbook that you have inherited without proper documentation is at best difficult and at worst impossible.
Don’t even start this unless you plan to create and maintain proper documentation. I do not mean: “We will knock up something when we have finished.” Once it is finished, people will be moving on to their next projects and will have little time for boring stuff like documentation. After-the-event documentation also loses all the decisions and the reasons for those decisions. The next team is left wondering why their predecessors did it that way. The reason will not matter in many cases but I have seen a product destroyed by a new team removing “unnecessary complexity” they did not understand. I always kept a notebook in which I recorded what I was doing and why during the day. I encouraged my staff to do the same. I insisted on something for the project log every week. The level of detail depends on the project. The question I asked myself was: “If I had just inherited this project, what happened during the last week that I would need to know?” This was in addition to an up-to-date specification for each component.
Sorry, I will get off my hobby-horse.
“In the short - medium term, we are expecting 50-100 replies. In the long term, it will be more, as people will be asked to send updates when their situation changes - these will have to be added as new entries with a new date attached to them.”
If you are going to keep a history of answers then Access will probably be a better repository than Excel. However, who is going to maintain the Access project and the central Excel workbooks? Access does not operate in the same way as Excel. Access VBA is not quite the same as Excel VBA. This will not matter if you are employing professionals experienced in both Access and Excel. But if you are employing amateurs who are picking up the necessary skills on the job then using both Access and Excel will increase what they have to learn and the likelihood that they will get confused.
If there are only 100 people/organisations submitting responses, you could merge responses and maintain one workbook per respondent to create something like:
            Answers -->
Question    1May2014    20Jun2014    7Nov2014
Aaaaaa      aa          bb           cc
Bbbbbb      dd          ee           ff
I am not necessarily recommending an Excel approach but it will have benefits in some circumstances. Personally, unless I was using professional programmers, I would start with an Excel only solution until I knew why I needed Access.
“I envision the base table to have all the 80 variables in different columns, and the answers as rows (i.e. each new column that comes with each excel file will need to be transposed and added as a new row).” I interpret this to mean a row will contain:
Respondent identifier
Date
Answer to Q1
Answer to Q2
: :
Answer to Q80.
My Access is very rusty. Is there a way of accessing attribute “Answer to Q(n)” or are you going to need 80 statements to move answers in and out? I hope there is no possibility of new questions. I found updating the database when a row changed a pain. I always favoured small rows such as:
Respondent identifier
Date
Question number
Answer
There are disadvantages to having lots of small rows but I always found the advantages outweighed them.
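To make the small-row layout concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for Access; the table name `answers`, its column names, the helper `unpivot` and the sample values are all hypothetical and only illustrate the idea of one row per respondent, date and question.

```python
import sqlite3


def unpivot(respondent_id, answer_date, answers):
    """Turn one questionnaire column (a list of answers) into narrow rows."""
    # Question numbers start at 1 to match the questionnaire's own numbering.
    return [
        (respondent_id, answer_date, number, answer)
        for number, answer in enumerate(answers, start=1)
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE answers (
        respondent_id   TEXT    NOT NULL,
        answer_date     TEXT    NOT NULL,
        question_number INTEGER NOT NULL,
        answer          TEXT,
        PRIMARY KEY (respondent_id, answer_date, question_number)
    )
    """
)

# Two submissions from the same (made-up) respondent, two questions each.
rows = unpivot("R001", "2014-05-01", ["aa", "dd"])
rows += unpivot("R001", "2014-06-20", ["bb", "ee"])
conn.executemany("INSERT INTO answers VALUES (?, ?, ?, ?)", rows)

# History of question 1 for one respondent: one query, no per-question columns.
history = conn.execute(
    "SELECT answer_date, answer FROM answers "
    "WHERE respondent_id = ? AND question_number = ? ORDER BY answer_date",
    ("R001", 1),
).fetchall()
print(history)
```

With this shape, adding a question later means new rows rather than a new column, which is the main advantage claimed for small rows above.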
Hope this helps.

Database design, avoiding redundant data on HTML forms

I'm trying to come up with a clean database design for a new project I'm working on. One of the data items I need to store in the database will come from an HTML form:
Q1: "Anticoagulated patient?" [Yes, No]
JavaScript: (If Yes is selected, an additional question is displayed):
Q2: "Type of Anticoagulant" [Warfarin, Coumarin, Clopidogrel]
My question is, is it necessary to store the first question's response in the database? To me the data seems redundant. If the type is specified, then it can be assumed that the patient is "anticoagulated".
Once the form is submitted, the form will be accessed at a later point so the data can be amended and the interface will need to reflect the state of the database. I should still be able to do this without needing to record the first question:
JavaScript: (If Q2 has a value, then the default option should be set to Yes;
otherwise it should be set to No)
Q1: "Anticoagulated patient?" [Yes, No]
JavaScript: (Only display if Q1 is set to Yes):
Q2: "Type of Anticoagulant" [Warfarin, Coumarin, Clopidogrel]
What are your thoughts on this?
I would say the extra space required to store the response to question 1 will be far outweighed by the amount of extra logic involved in marking Q2 as implying Q1. Keep it simple!
I think the two questions should indeed be handled on the client-side, as you suggested. This will provide a better user experience.
As for the database, don't use two fields, one is more than enough. Note that even if you don't show the second question dynamically, you can still store the answer somewhere else (such as the web server session) and not the back-end database.
I think you have made the right decision on this, but think about the future and about extending the application logic: you may need to add another question later, and then you will have to store the question in the database.
In short, for the current situation it is adequate to store the answer.
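For the single-field option discussed in this thread, the rule "if Q2 has a value, Q1 defaults to Yes" can be sketched as below. This is only an illustration in Python rather than the page's JavaScript; the function name `form_state` and the dictionary keys are hypothetical, and it assumes the stored anticoagulant type is NULL/None when nothing was selected.

```python
from typing import Optional


def form_state(anticoagulant_type: Optional[str]) -> dict:
    """Derive what the form should show from the single stored field."""
    # Q1 is never stored: it is implied by whether Q2 has a value.
    anticoagulated = anticoagulant_type is not None
    return {
        "q1": "Yes" if anticoagulated else "No",
        "show_q2": anticoagulated,
        "q2": anticoagulant_type,
    }


print(form_state("Warfarin"))
print(form_state(None))
```

One limitation of this sketch: a missing value cannot distinguish "answered No" from "never answered", which is one argument for storing Q1 explicitly.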

Database design help with varying schemas

I work for a billing service that uses some complicated mainframe-based billing software for its core services. We have all kinds of codes we set up that are used for tracking things: payment codes, provider codes, write-off codes, etc... Each type of code has a completely different set of data items that control what the code does and how it behaves.
I am tasked with building a new system for tracking changes made to these codes. We want to know who requested what code, who/when it was reviewed, approved, and implemented, and what the exact setup looked like for that code. The current process only tracks two of the different types of code. This project will add immediate support for a third, with the goal of also making it easy to add additional code types into the same process at a later date. My design conundrum is that each code type has a different set of data that needs to be configured with it, of varying complexity. So I have a few choices available:
I could give each code type its own table(s) and build them independently. Considering we only have three codes I'm concerned about at the moment, this would be simplest. However, this concept has already failed or I wouldn't be building a new system in the first place. It's also weak in that the code involved in writing generic source code at the presentation level to display request data for any code type (even those not yet implemented) is not trivial.
Build a db schema capable of storing the data points associated with each code type: not only values, but what type they are and how they should be displayed (dropdown list from an enum of some kind). I have a decent db schema for this started, but it just feels wrong: overly complicated to query and maintain, and it ultimately requires a custom query to view full data in nice tabular form for each code type anyway.
Storing the data points for each code request as xml. This greatly simplifies the database design and will hopefully make it easier to build the interface: just set up a schema for each code type. Then have code that validates requests to their schema, transforms a schema into display widgets and maps an actual request item onto the display. What this item lacks is how to handle changes to the schema.
My questions are: how would you do it? Am I missing any big design options? Any other pros/cons to those choices?
My current inclination is to go with the xml option. Given the schema updates are expected but extremely infrequent (probably less than one per code type per 18 months), should I just build it to assume the schema never changes, but so that I can easily add support for a changing schema later? What would that look like in SQL Server 2000 (we're moving to SQL Server 2005, but that won't be ready until after this project is supposed to be completed)?
[Update]:
One reason I'm thinking xml is that some of the data will be complex: nested/conditional data, enumerated drop down lists, etc. But I really don't need to query any of it. So I was thinking it would be easier to define this data in xml schemas.
However, le dorfier's point about introducing a whole new technology hit very close to home. We currently use very little xml anywhere. That's slowly changing, but at the moment this would look a little out of place.
I'm also not entirely sure how to build an input form from a schema, and then merge a record that matches that schema into the form in an elegant way. It will be very common to only store a partially-completed record and so I don't want to build the form from the record itself. That's a topic for a different question, though.
Based on all the comments so far Xml is still the leading candidate. Separate tables may be as good or better, but I have the feeling that my manager would see that as not different or generic enough compared to what we're currently doing.
There is no simple, generic solution to a complex, meticulous problem. You can't have both simple storage and simple app logic at the same time. Either the database structure must be complex, or else your app must be complex as it interprets the data.
I outline five solutions to this general problem in "product table, many kind of product, each product have many parameters."
For your situation, I would lean toward Concrete Table Inheritance or Serialized LOB (the XML solution).
The reason that XML might be a good solution is that:
You don't need to use SQL to pick out individual fields; you're always going to display the whole form.
Your XML can annotate fields for data type, user interface control, etc.
But of course you need to add code to parse and validate the XML. You should use an XML schema to help with this. In which case you're just replacing one technology for enforcing data organization (RDBMS) with another (XML schema).
You could also use an RDF solution instead of an RDBMS. In RDF, metadata is queryable and extensible, and you can model entities with "facts" about them. For example:
Payment code XYZ contains attribute TradeCredit (Net-30, Net-60, etc.)
Attribute TradeCredit is of type CalendarInterval
Type CalendarInterval is displayed as a drop-down
.. and so on
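The facts listed above can be mimicked in a few lines of plain Python to show how display metadata becomes queryable. This is a sketch of the triple idea only, not an RDF store or any RDF library's API; the predicate names (`hasAttribute`, `hasType`, `displayedAs`) and the helper functions are invented for illustration.

```python
# Each fact is a (subject, predicate, object) triple, mirroring the list above.
facts = [
    ("PaymentCode:XYZ", "hasAttribute", "TradeCredit"),
    ("TradeCredit", "hasType", "CalendarInterval"),
    ("CalendarInterval", "displayedAs", "drop-down"),
]


def objects(subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in facts if s == subject and p == predicate]


def widgets_for(code):
    """Follow attribute -> type -> widget to decide how to render each attribute."""
    result = {}
    for attribute in objects(code, "hasAttribute"):
        for type_name in objects(attribute, "hasType"):
            for widget in objects(type_name, "displayedAs"):
                result[attribute] = widget
    return result


print(widgets_for("PaymentCode:XYZ"))
```

Adding a new code type or attribute then means appending facts rather than altering a table definition.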
Re your comments: Yeah, I am wary of any solution that uses XML. To paraphrase Jamie Zawinski:
Some people, when confronted with a problem, think "I know, I'll use XML." Now they have two problems.
Another solution would be to invent a little Domain-Specific Language to describe your forms. Use that to generate the user-interface. Then use the database only to store the values for form data instances.
Why do you say "this concept has already failed or I wouldn't be building a new system in the first place"? Is it because you suspect there must be a scheme for handling them in common?
Else I'd say to continue the existing philosophy, and establish additional tables. At least it would be sharing an existing pattern and maintaining some consistency in that respect.
Do a web search on "generalized specialized relational modeling". You'll find articles on how to set up tables that store the attributes of each kind of code, and the attributes common to all codes.
If you’re interested in object modeling, just search on “generalized specialized object modeling”.

Best Practices: Storing a workflow state of an item in a database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I have a question about best practices regarding how one should approach storing complex workflow states for processing tasks in a database. I've been looking online to no avail, so I figured I'd ask the community what they thought was best.
This question comes out of the same "BoxItem" example I gave in a prior question. This "BoxItem" is being tracked in my system as various tasks are performed on it. The task may take place over several days and with human interaction, so the state of the BoxItem must be persisted. Who did the task (if applicable), and when the task was done must also be tracked.
At first, I approached this by adding three fields to the "BoxItems" table for every human-interactive task that needed to be done:
IsTaskNameComplete
DateTaskNameComplete
UserTaskNameComplete
This worked when the workflow was simple... but now that it has grown to a complex process (> 10 possible human interactions in the flow... about half of which are optional, and may or may not be done for the BoxItem, which resulted in me beginning to add "DoTaskName" fields as well for those optional tasks), I've found that what should've been a simple table now has 40 or so fields devoted entirely to the retaining of this state information.
I find myself asking if there isn't a better way to do it... but I'm at a loss.
My first thought was to make a generic "BoxItemTasks" table which defined the tasks that may be done on a given box, but I still would need to save the Date and User information individually, so it didn't really help.
My second thought was that perhaps it didn't matter, and I shouldn't worry if this table has 40 or more fields devoted to state retaining... and maybe I'm just being paranoid. But it feels like that's a lot of information to retain.
Anyways, I'm at a loss as far as what a third option might be, or if one of the two options above is actually reasonable. I can see this workflow potentially getting even more complex in the future, and for each new task I'm going to need to add 3-4 fields just to support the tracking of it... it feels like it's spiraling out of control.
What would you do in this situation?
I should note that this is maintenance of an existing system, one that was built without an ORM, so I can't just leave it up to the ORM to take care of it.
EDIT:
Kev, are you talking about doing something like this:
BoxItems
(PK) BoxItemID
(Other irrelevant stuff)
BoxItemActions
(PK) BoxItemID
(PK) BoxItemTaskID
IsCompleted
DateCompleted
UserCompleted
BoxItemTasks
(PK) TaskType
Description (if even necessary)
Hmm... that would work... it would represent a need to change how I currently approach doing SQL Queries to see which items are in what state, but in the long term something like this looks like it would work better (without having to make a fundamental design change like the Serialization idea represents... though if I had the time, I'd like to do it that way I think.).
So is this what you were mentioning Kev, or am I off on it?
EDIT: Ah, I see your idea as well with the "Last Action" to determine the current state... I like it! I think that might work for me... I might have to change it up a little bit (because at some point tasks happen concurrently), but the idea seems like a good one!
EDIT FINAL: So in summation, if anyone else is looking this up in the future with the same question... it sounds like the serialization approach would be useful if your system has the information pre-loaded into some interface where it's queryable (i.e. not directly calling the database itself, as the ad-hoc system I'm working on does), but if you don't have that, the additional tables idea seems like it should work well! Thank you all for your responses!
If I'm understanding correctly, I would add the BoxItemTasks table (just an enumeration table, right?), then a BoxItemActions table with foreign keys to BoxItems and to BoxItemTasks for what type of task it is. If you want to make it so that a particular task can only be performed once on a particular box item, just make the (Items + Tasks) pair of columns be the primary key of BoxItemActions.
(You laid it out much better than I did, and kudos for correctly interpreting what I was saying. What you wrote is exactly what I was picturing.)
As for determining the current state, you could write a trigger on BoxItemActions that updates a single column BoxItems.LastAction. For concurrent actions, your trigger could just have special cases to decide which action counts as the most recent.
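A minimal sketch of this layout follows, using Python's standard-library sqlite3 module as a stand-in for the real database. The table and column names come from the thread; the trigger name `SetLastAction`, the sample tasks, dates and users are made up, and the sketch assumes a row is inserted into BoxItemActions only when a task completes (so no IsCompleted column is needed). Trigger syntax differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE BoxItemTasks (
        BoxItemTaskID INTEGER PRIMARY KEY,
        Description   TEXT
    );
    CREATE TABLE BoxItems (
        BoxItemID  INTEGER PRIMARY KEY,
        LastAction INTEGER REFERENCES BoxItemTasks (BoxItemTaskID)
    );
    -- The composite primary key allows each task at most once per box item.
    CREATE TABLE BoxItemActions (
        BoxItemID     INTEGER NOT NULL REFERENCES BoxItems (BoxItemID),
        BoxItemTaskID INTEGER NOT NULL REFERENCES BoxItemTasks (BoxItemTaskID),
        DateCompleted TEXT NOT NULL,
        UserCompleted TEXT NOT NULL,
        PRIMARY KEY (BoxItemID, BoxItemTaskID)
    );
    -- Keep BoxItems.LastAction pointing at the most recently inserted action.
    CREATE TRIGGER SetLastAction AFTER INSERT ON BoxItemActions
    BEGIN
        UPDATE BoxItems SET LastAction = NEW.BoxItemTaskID
        WHERE BoxItemID = NEW.BoxItemID;
    END;
    """
)

# Sample data (hypothetical): two task types and one box item.
conn.executemany(
    "INSERT INTO BoxItemTasks VALUES (?, ?)", [(1, "Inspect"), (2, "Pack")]
)
conn.execute("INSERT INTO BoxItems (BoxItemID) VALUES (1)")
conn.execute("INSERT INTO BoxItemActions VALUES (1, 1, '2024-01-01', 'alice')")
conn.execute("INSERT INTO BoxItemActions VALUES (1, 2, '2024-01-02', 'bob')")

last_action = conn.execute(
    "SELECT LastAction FROM BoxItems WHERE BoxItemID = 1"
).fetchone()[0]

# Repeating a task for the same box item violates the composite primary key.
try:
    conn.execute("INSERT INTO BoxItemActions VALUES (1, 2, '2024-01-03', 'carol')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(last_action, duplicate_rejected)
```

Handling concurrent tasks would need extra logic inside the trigger, as noted above; this sketch simply treats the latest insert as the last action.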
As the previous answer suggested, I would break your table into several.
BoxItemActions, containing a list of actions that the work flow needs to go through, created each time a BoxItem is created. In this table, you can track the detailed dates \ times \ users of when each task was completed.
With this type of application, knowing where the Box is to go next can get quite tricky, so having a 'Map' of the remaining steps for the Box will prove quite helpful. As well, this table can grow like crazy, hundreds of rows per box, and it will still be very easy to query.
It also makes it possible to have 'different paths' that can easily be changed. A master data table of 'paths' through the work flow is one solution, where, as each box is created, the user has to select which 'path' the box will follow. Or you could set it up so that when the user creates the box, they select which tasks are required for this particular box. It depends on your business problem.
How about a hybrid of the serialization and the database models? Have an XML document that serves as your master workflow document, containing a node for each step with attributes and elements that detail its name, order in the process, conditions for whether it's optional or not, etc. Most importantly each step node can have a unique step id.
Then in your database you have a simple two table structure. The BoxItems table stores your basic BoxItem data. Then a BoxItemActions table much like in the solution you marked as the answer.
It's essentially similar to the solution accepted as the answer, but instead of a BoxItemTasks table to store the master list of tasks, you use an XML document that allows for some more flexibility for the actual workflow definition.
For what it's worth, in BizTalk they "dehydrate" long-running message patterns (workflows and the like) by binary serializing them to the database.
I think I would serialize the Workflow object to XML and store in the database with an ID column. It may be more difficult to report on, but it sounds like it may work in your case.
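As a rough illustration of the serialization approach, the sketch below stores a workflow as XML text in a table keyed by an ID and reads it back. It uses Python's standard-library xml.etree.ElementTree and sqlite3; the `Workflows` table, the element and attribute names, and the step data are all hypothetical, and a real Workflow object would carry more state than this.

```python
import sqlite3
import xml.etree.ElementTree as ET


def workflow_to_xml(steps):
    """Serialize a list of step dicts to an XML string."""
    root = ET.Element("workflow")
    for step in steps:
        ET.SubElement(
            root,
            "step",
            name=step["name"],
            completed=str(step["completed"]).lower(),
        )
    return ET.tostring(root, encoding="unicode")


def xml_to_workflow(xml_text):
    """Rebuild the list of step dicts from the stored XML string."""
    root = ET.fromstring(xml_text)
    return [
        {"name": s.get("name"), "completed": s.get("completed") == "true"}
        for s in root.findall("step")
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Workflows (WorkflowID INTEGER PRIMARY KEY, Body TEXT NOT NULL)"
)

steps = [
    {"name": "Inspect", "completed": True},
    {"name": "Pack", "completed": False},
]
conn.execute("INSERT INTO Workflows VALUES (?, ?)", (1, workflow_to_xml(steps)))

stored = conn.execute(
    "SELECT Body FROM Workflows WHERE WorkflowID = 1"
).fetchone()[0]
restored = xml_to_workflow(stored)
print(restored)
```

The trade-off mentioned above is visible here: the database sees only an opaque text column, so reporting on individual steps requires parsing the XML first.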
