I have a development version of Salesforce that I'd like to delete all the records from. My analyst tells me that doing it manually will take a week, or that we can refresh it, which could take anywhere from minutes to a week.
It seems to me that there must be a better way of doing this. In SQL, I'd make a list of tables to truncate and dynamically truncate each one. Is there some way to do this with Salesforce?
It won't take you a week to refresh a sandbox; it usually takes only around 30 minutes. You can also write an execute anonymous script that deletes the data from your objects:
// Query the object's records (subject to Apex governor limits), then delete them
List<YourObject> objectData = [SELECT Id FROM YourObject];
delete objectData;
I've got a CSV file that refreshes every 60 seconds with live data from the internet. I want to automatically update my Access database (on roughly a 60-second interval) with the new rows that get downloaded; however, I can't simply link the DB to the CSV.
The CSV always contains exactly 365 days of data, so when another day ticks over, a day of data drops off. If I were to link to the CSV, my DB would only ever have those 365 days of data, whereas I want to append the new data to the existing database.
Any help with this would be appreciated.
Thanks.
As per the comments, the first step is to link your CSV to the database, not as your main table but as a secondary table that will be used to update your main table.
Once you do that you have two problems to solve:
Identify the new records
I assume there is a way to do so by timestamp or ID, so all you have to do is hold on to the last ID or timestamp imported (that will require an additional mini-table to hold the value persistently); see the sketch after this list.
Make it happen every 60 seconds. To get that update on a regular interval you have two options:
A form's 'OnTimer' event is the easy way but requires very specific conditions. You have to make sure the form that triggers the event is only open once. This is possible even in a multi-user environment with some smart tracking.
If having an Access form open to do the updating is not workable, then you have to work with Windows scheduled tasks. You can set up an Access Macro to run as a Windows scheduled task.
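As a minimal sketch, assuming the linked CSV table is called CsvFeed, your main table is Readings, and a one-row table LastImport holds the most recent imported timestamp (all of these names are placeholders), the append query could look something like this (Access SQL view doesn't support inline comments, so the explanation stays up here):
INSERT INTO Readings (ReadingTime, DataValue)
SELECT c.ReadingTime, c.DataValue
FROM CsvFeed AS c
WHERE c.ReadingTime > (SELECT MAX(LastTimestamp) FROM LastImport);
After the append runs, update LastImport with the newest ReadingTime (for example via DMax in the scheduled macro or VBA) so the next run only picks up rows added since the last one.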
Using SQL, it is taking over 4 hours every evening to pull all the data from the twelve Production database tables or views needed for our Sandbox database. There has to be a significantly more efficient and effective way to get this data into our Sandbox.
Currently, I'm creating a UID (unique ID) by concatenating the views' primary keys and system date fields.
The UID is used in two steps:
Step 1.
INSERT INTO Sandbox
WHERE UID IS NULL,
looking back only the last 30 days based on the system date
(using a LEFT JOIN from the Production Table/View.UID to the existing Sandbox Table/View.UID)
Step 2.
UPDATE Sandbox
WHERE Production.UID = Sandbox.UID
(using an INNER JOIN of the Production Table/View.UID to the existing Sandbox Table/View.UID)
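For illustration, the two steps look roughly like this (the table and column names here are placeholders, not my real schema):
-- Step 1: insert production rows from the last 30 days that don't exist in the Sandbox yet
INSERT INTO SandboxTable (UID, Col1, Col2, SysDate)
SELECT p.UID, p.Col1, p.Col2, p.SysDate
FROM ProductionView AS p
LEFT JOIN SandboxTable AS s ON s.UID = p.UID
WHERE s.UID IS NULL
  AND p.SysDate >= DATEADD(DAY, -30, GETDATE());

-- Step 2: refresh Sandbox rows that already exist, matched on UID
UPDATE s
SET s.Col1 = p.Col1,
    s.Col2 = p.Col2
FROM SandboxTable AS s
INNER JOIN ProductionView AS p ON p.UID = s.UID;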
I've cut the 4-hour run time down to 2 hours, but it feels like the process I've created is missing a (big) step.
How can I cut this time down? Should I put a 30 day filter on my UPDATE statement as well?
Assuming you're not moving billions of rows into your development environment, I would just create a simple ETL strategy to truncate the dev environment and do a full load from production. If you don't want the full dataset, add a filter to the source queries for your ETL. Just make sure that doesn't have any effect on the integrity of the data.
If your data is in the billions, you likely have an enterprise storage solution in place. Many of those can handle snapshotting the data files to another location. There are some security aspects to that approach that you'll need to consider as well.
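As a minimal sketch of that truncate-and-reload approach for one table, assuming both databases are reachable from the same server and using placeholder names (Sandbox, Production, dbo.Orders, OrderDate):
-- wipe the dev copy, then reload it from production with an optional date filter
TRUNCATE TABLE Sandbox.dbo.Orders;

INSERT INTO Sandbox.dbo.Orders (OrderID, CustomerID, OrderDate)
SELECT o.OrderID, o.CustomerID, o.OrderDate
FROM Production.dbo.Orders AS o
WHERE o.OrderDate >= DATEADD(DAY, -90, GETDATE());
If foreign keys reference a table, you'd need DELETE (or to drop/disable the constraints during the load) instead of TRUNCATE.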
I found an answer that is in two parts. It may not be the best solution, but it seems to be working for the moment.
I can use primary keys as my UID from the production box database tables (for the most part), updating them using a 30-90 day filter.
The views are a bit trickier, as they UNION two tables with the same structure and can therefore contain duplicate primary keys. So I created my own UID by concatenating multiple primary key fields, and I update with the same 30-90 day filter.
The previous process would take up to 4+ hours to complete. The new process runs in an hour, and seems to be working for the moment.
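For the view case, one way to express the concatenated UID and the date-window upsert is a MERGE; everything below (ProductionView, SandboxTable, Pk1, Pk2, SysDate, Col1, Col2) is a placeholder, not my real schema:
-- build a surrogate UID from the key fields, then upsert the last 90 days on it
WITH prod AS (
    SELECT CONCAT(v.Pk1, '|', v.Pk2) AS UID, v.Col1, v.Col2
    FROM ProductionView AS v
    WHERE v.SysDate >= DATEADD(DAY, -90, GETDATE())
)
MERGE SandboxTable AS s
USING prod AS p
    ON p.UID = s.UID
WHEN MATCHED THEN
    UPDATE SET s.Col1 = p.Col1, s.Col2 = p.Col2
WHEN NOT MATCHED THEN
    INSERT (UID, Col1, Col2) VALUES (p.UID, p.Col1, p.Col2);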
I've tried to search for some ideas but can't find anything that's very suitable for my scenario.
I have a table which I write and update data to from multiple sites, maybe a row per second during specific hours of the day, with around 50k records added daily on average. Separate from this, I have dashboards where people can query this data, but some of the queries may be quite complex and take a number of seconds to complete.
I can't afford for my writes/updates to slow down.
Although the dashboards don't need to be real time, it would be a bonus
I'm hosting on Azure SQL DB S2. What options are available?
My current idea is to use an 'active' table for writes/updates and flush the data to the full table every x minutes. My only concern is that I have a seeded bigint as the PK on the main table, and because I also save other data linked to this ID, I'd have problems linking to it until I commit to the main table. An option would be to reseed the active table and turn IDENTITY_INSERT on for the main table so I can populate the IDs myself, but I'm not 100% happy with that.
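Roughly what I have in mind for the flush, assuming the active table's identity is seeded into a range that can't collide with anything already in the main table (all names below are placeholders):
-- move the rows accumulated so far, keeping the IDs generated in the active table
DECLARE @MaxId bigint = (SELECT MAX(Id) FROM dbo.ActiveLog);

BEGIN TRANSACTION;

SET IDENTITY_INSERT dbo.MainLog ON;

INSERT INTO dbo.MainLog (Id, SiteId, LoggedAt, Payload)
SELECT a.Id, a.SiteId, a.LoggedAt, a.Payload
FROM dbo.ActiveLog AS a
WHERE a.Id <= @MaxId;   -- only rows that existed when the flush started

SET IDENTITY_INSERT dbo.MainLog OFF;

DELETE FROM dbo.ActiveLog
WHERE Id <= @MaxId;     -- rows arriving during the flush stay for the next run

COMMIT TRANSACTION;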
Just looking for suggestions until I go ahead with my current idea! Thanks
Okay, just to clarify: I have a SQL table (with the columns ID, School, Student ID, Name, Fee $, Fee Type, and Paid) that needs to be shown in a Grid that will be uploaded to a website. The Grid shows everything correctly and shows which fees need to be paid. The Paid column has a bit data type for 1 or 0 (basically a checklist). I am being asked to add two more columns: User and DateChanged. The reason is to log which staff member changed the "Paid" column. It would capture in the SQL table only the username of the staff member who changed it, and also the time. So to clarify even more, I need to create two columns, "User" and "DateChanged", and those columns would log when someone changed the "Paid" column.
For example: user Bob checks the Paid column for student X on 5/2/17 at 10pm.
In the same row of student X's info, "Bob" would appear under the User column, and under DateChanged it would show 2017-05-02 10pm.
What steps would I take to make this possible?
I'm currently an IT intern and all this SQL stuff is new to me. Let me know if you need more clarification. FYI, the two new columns, User and DateChanged, will not be on the grid.
The way to do this as you've described is to use a trigger. I have an example of some code below but be warned as triggers can have unexpected side-effects, depending on how the database and app interface are set up.
If it is possible for you to change the application code that sends SQL queries to the database instead, that would be much safer than using a trigger. You can still add the new fields, you would just be relying on the app to keep them updated instead of doing it all in SQL.
Things to keep in mind about this code:
If any background processes or procedures make updates to the table, the trigger will overwrite the timestamp and username automatically, because it fires on any update to the row(s) in question.
If the users don't have any direct access to SQL Server (in other words, the app is the only thing connecting to the database), then it is possible that the app will only be using one database login username for everyone, and in that case you will not be able to figure out which user made the update unless you can change the application code.
If anyone changes something by accident and then changes it back, it will overwrite your timestamp and make it look like the wrong person made the update.
Triggers can potentially bog down the database system if there are a very large number of rows and/or a high number of updates being made to the table constantly, because the trigger code will be executed every time an update is made to a row in the table.
But if you don't have access to change the application code, and you want to give triggers a try, here's some example code that should do what you are needing:
create trigger TG_Payments_Update on Payments
after update
as
begin
    -- stamp the rows that were just changed with the current time and database user
    update p
    set DateChanged = GetDate(), UserChanged = USER_NAME()
    from Payments p
    inner join inserted i on p.ID = i.ID
end
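To sanity-check the trigger, run an update and then look at the row (the ID value 42 is just an example):
-- flip the Paid flag for one row, then confirm the audit columns were stamped
UPDATE Payments SET Paid = 1 WHERE ID = 42;
SELECT ID, Paid, UserChanged, DateChanged FROM Payments WHERE ID = 42;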
The web app already knows the current user working on the system, so your update would just include that user's ID and the current system time for when the action took place.
I would not rely on SQL Server triggers since that hides what's going on within the system. Plus, as others have said, they have side effects to deal with too.
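As a minimal sketch of that app-side update (the parameter names are placeholders; the app supplies the user and the timestamp):
-- parameterized statement issued by the web app when someone toggles Paid
UPDATE Payments
SET Paid        = @Paid,
    UserChanged = @UserName,
    DateChanged = @ChangedAt
WHERE ID = @PaymentID;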
I need to move data that is a month old from a logging table to a logging-archive table, and remove data older than a year from the latter.
There is a lot of data (600k inserts in 2 months).
I was considering simply calling (as a batch) a stored proc every day/week.
I first thought about doing two stored procs:
Deleting from the archives what is older than 365 days
Moving the data older than 30 days from logging to archive (I suppose there's a way to do that with one SQL query)
Removing from logging what is older than 30 days.
However, this solution seems quite inefficient and will probably lock the DB for a few minutes, which I do not want.
So, do I have any alternatives, and what are they?
None of this should lock the tables that you actually use. You are writing only to the logging table currently, and only to new records.
You are selecting from the logging table only OLD records, and writing to a table that you don't write to except for the archive process.
The steps you are taking sound fine. I would go one step further, and instead of deleting based on date, just do an INNER JOIN to your archive table on your id field - then you only delete the specific records you have archived.
As a side note, 600k records is not very big at all. We have production DBs with tables over 2 billion rows, and I know some other folks here have DBs with millions of inserts a minute into transactional tables.
Edit:
I forgot to include originally, another benefit of your planned method is that each step is isolated. If you need to stop for any reason, none of your steps is destructive or depends on the next step executing immediately. You could potentially archive a lot of records, then run the deletes the next day or overnight without creating any issues.
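A minimal sketch of that pattern, with placeholder table and column names (LoggingTable, LoggingArchive, Id, LoggedAt, Message):
-- 1. copy rows older than 30 days into the archive
INSERT INTO LoggingArchive (Id, LoggedAt, Message)
SELECT l.Id, l.LoggedAt, l.Message
FROM LoggingTable AS l
WHERE l.LoggedAt < DATEADD(DAY, -30, GETDATE());

-- 2. delete from logging only the rows that actually made it into the archive, joining on the id field
DELETE l
FROM LoggingTable AS l
INNER JOIN LoggingArchive AS a ON a.Id = l.Id;

-- 3. purge archived rows older than a year
DELETE FROM LoggingArchive
WHERE LoggedAt < DATEADD(YEAR, -1, GETDATE());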
What if you archived to a secondary database?
I.e.:
Primary database has the logging table.
Secondary database has the archive table.
That way, if you're worried about locking your archive table while you run a batch on it, the batch won't take your primary database down.
But in any case, I'm not sure you have to worry about locking; I guess it just depends on how you implement it.