How to split a huge DBF file? - sybase

The task is to split a huge DBF file from a very old ERP system.
The sizes of the files are:
table1.dbf - 5307 MB
table1.cdx - 288 MB
table1.fpt - 617 MB
I do not understand how it worked on 32-bit Windows... but it works.
Do you have an idea how to split the table into 2 files?
For example, the last 10% of records in one file and the oldest 90% of records in another file.
It would be better to complete the task in my environment (Win10 64-bit).

The bottom line here is that if table1.dbf is > 2 GB then it was not Visual FoxPro that was working with it, and the Visual FoxPro OLE DB driver will not be able to read it.
It is more likely to have been SAP's Advantage Database Server.
So I would investigate getting hold of an ODBC/OLE DB driver for that and using it to extract the data.

USE table1
LOCATE the row you want to split at (everything from the located record to the last record will be copied):
COPY TO table2 REST
Make sure table2 has been generated. If table2 has been created, then locate the same row in table1 again and run:
DELETE REST
Finally, PACK table1 (opened EXCLUSIVE) to physically remove the records that DELETE only marked as deleted; otherwise table1.dbf keeps its original size.
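If you end up doing this outside VFP, a very rough sketch of the 90%/10% split using the Python dbf package shown further down this page might look like the following. Treat it as an untested sketch: the paths are hypothetical, the new()/Tables() calls just mirror the later example, and the 2 GB caveat above may still get in the way.

import dbf

# hypothetical paths; table1_old gets the oldest 90%, table1_new the last 10%
source = dbf.Table(r'C:\erp\table1.dbf')
old_part = source.new(r'C:\erp\table1_old.dbf')   # empty tables with the same structure
new_part = source.new(r'C:\erp\table1_new.dbf')

with dbf.Tables(source, old_part, new_part):
    cutoff = int(len(source) * 0.9)               # split point: first 90% vs. last 10%
    for record in source[:cutoff]:
        old_part.append(record)
    for record in source[cutoff:]:
        new_part.append(record)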

Error: The buffer manager cannot extend the file "C:\Users\user_name\... .tmp" to length 99996168 bytes. There was insufficient disk space

We are running a package that processes around 1 million records, and the package uses multiple sorts and joins. The package is getting aborted with the error below:
Error: The buffer manager cannot extend the file "C:\Users\user_name\AppData\Local\Temp\2\DTS{DCC58934-0E3F-4E9D-9858-E2E0A39002A2}.tmp" to length 99996168 bytes.
There was insufficient disk space.
We have tried increasing the DefaultBufferSize to 100 MB, and the C:\ drive also has around 10 GB of free space. We have also tried adding a batch process that clears the \Temp directory before the package starts, but the issue persists.
The user who is running the package has admin rights on the system. We haven't yet tried setting the BufferTempStoragePath. Please suggest other workarounds.
Thanks!
I have encountered the same error. In my case the reference dataset has 4+ million records that are compared against the source data, and I am using a Lookup transformation with full cache. When the package executes, memory fills up and the error appears. My understanding of this behavior is that when memory is full, virtual memory steps in and provides some more space to load the buffer, but when virtual memory also fails to help, the SSIS package crashes and the error mentioned in the title appears.
Initial query used in the Lookup transformation: Select customerid, customercode from customerBase(nolock) /* the record count increases every time the CRM load runs through this same package */
I then tried to minimize the reference dataset loaded by the Lookup transformation, so that out of the 4+ million records only a limited recordset is pulled and can be loaded into memory easily.
Modified query: Select customerid, customercode from customerBase(nolock) cst inner join cityBase(nolock) cty on cst.cityid=cty.cityid where cityname='Karachi'
Note: how to load a minimal reference dataset into memory with the Lookup transformation will vary in your scenario. With such queries and full cache, SSIS then matches the source data against the reference dataset.
Hope this approach helps someone.
Thanks

How can I use SQL*Loader to load data into my database tables directly from a tar.gz file?

I am trying to load data into my Oracle database table from an external tar.gz file. I can load data easily from a standard text file using SQL*Loader, but I'm not sure how to do the same if I have a tar.gz file instead of a text file.
I found the following link somewhat helpful:
http://www.simonecampora.com/blog/2010/07/09/how-to-extract-and-load-a-whole-table-in-oracle-using-sqlplus-sqlldr-named-pipes-and-zipped-dumps-on-unix/
However, the author of that link is using .dat.gz instead of .tar.gz. Is there any way to load data into my Oracle database table using SQL*Loader from a tar.gz file instead of a text file?
Also, part of the problem for me is that I'm supposed to load data from a NEW tar.gz file every hour into the same table. For example, in hour 1 I have file1.tar.gz and I load all its 10 rows of data into TABLE in my Oracle database. In hour 2 I have file2.tar.gz and I have to load its 10 rows of data into the same TABLE. But the 10 rows extracted by SQL*Loader from file2.tar.gz keep replacing the first 10 rows extracted from file1.tar.gz. Is there any way I can save the rows from file1.tar.gz as rows 1-10 and the rows from file2.tar.gz as rows 11-20 using SQL*Loader?
The magic is in the "zcat" part. zcat can output the contents of zipped files, including tar.gz.
For example, try: zcat yourfile.tar.gz and you will see output. In the example URL you provided, they redirect the output of zcat into a named pipe that SQLLDR can read from.
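If a scripted variant helps, here is a minimal sketch of the same idea in Python (Unix only): instead of zcat it uses the tarfile module to pull the data file out of the archive and stream it into a named pipe that sqlldr reads from. The control file name, connect string, and the assumption that the archive contains a single data file are all hypothetical.

import os
import subprocess
import tarfile

pipe_path = "/tmp/sqlldr_pipe.dat"
if not os.path.exists(pipe_path):
    os.mkfifo(pipe_path)  # named pipe, like the mkfifo step in the linked article

# Start sqlldr first; it blocks until data arrives on the pipe.
# (The APPEND keyword in the control file adds each hour's rows instead of replacing them.)
loader = subprocess.Popen(
    ["sqlldr", "userid=user/password@db", "control=load_table.ctl", "data=" + pipe_path]
)

# Stream the archive member into the pipe (this is the "zcat" part).
with tarfile.open("file1.tar.gz", "r:gz") as archive, open(pipe_path, "wb") as pipe:
    member = archive.getmembers()[0]            # assume a single data file inside
    extracted = archive.extractfile(member)
    while True:
        chunk = extracted.read(1 << 20)
        if not chunk:
            break
        pipe.write(chunk)

loader.wait()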

Is there a faster way to delete the first x rows from a DBF?

I have been trying to use CDBFLite to delete records 1 through 5 million or so of a DBF file (in order to decrease the file size). Due to factors beyond my control, this is something I will have to do every day. The file size exceeds 2 GB.
However, it takes forever to run the delete commands. Is there a faster way to just eliminate the first X records of a DBF (and thus end up with a smaller file size)?
As noted by Ethan, a .DBF file typically caps at the standard 2 GB limit per single file, unless you are dealing with another software engine such as Sybase's Advantage Database Server, which can read/write .DBF files that exceed the 2 GB capacity.
That said, the standard DBF format has a single character on each record as a "flag" that the record is deleted, yet the record still retains its space. In order to reduce the size, you need to PACK the file, which actually REMOVES the deleted records and thus shrinks the file back down.
Now, Ethan has options via Python, and I via C#.NET and the Microsoft Visual FoxPro OLE DB provider, and we can offer more, but I don't know what you have access to.
If you have VFP (or dBASE) directly, then it should be as simple as getting to the Command window and doing
USE [YourTable] EXCLUSIVE
PACK
But I would make a backup copy of the file first as a simple precaution.
Here's a very rough outline using my dbf package:
import dbf
import shutil

database = r'\some\path\to\database.dbf'
backup = r'\some\backup\path\database.backup.dbf'

# make backup copy
shutil.copy(database, backup)

# open copy
backup = dbf.Table(backup)

# overwrite original (new empty table with the same structure)
database = backup.new(database)

# copy over the last xxx records
with dbf.Tables(backup, database):
    for record in backup[-10000:]:
        database.append(record)
I suspect copying over the last however many records you want will be quicker than packing.

Import textfiles into linked SQL Server tables in Access

I have an Access (2010) database (front end) with linked SQL Server tables (back end), and I need to import text files into these tables. These text files are very large (some have more than 200,000 records and approx. 20 fields).
The problem is that I can't import the text files directly into the SQL tables. Some files contain empty lines at the start, and other lines that I don't want to import into the tables.
So here's what I did in my Access database:
1) I created a link to the text files.
2) I also have a link to the SQL Server tables
3a) I created an append query that copies the records from the linked text file to the linked SQL Server table.
3b) I created VBA code that opens both tables and copies the records from the text file into the SQL Server table, record by record. (I tried it in different ways: with DAO and with ADODB.)
[Steps 3a and 3b are two different ways I tried to import the data. I use one of them, not both. I prefer option 3b, because I can run a counter in the status bar showing how many records still need to be imported at any moment; I can see how far along it is.]
The problem is that it takes a lot of time to run... and I mean a LOT of time: 3 hours for a file with 70,000 records and 20 fields!
When I do the same with an Access table (from TXT to Access), it's much faster.
I have 15 tables like this (with even more records), and I need to do these imports every day. I run this procedure automatically every night (between 20:00 and 6:00).
Is there an easier way to do this?
What is the best way to do this?
This feels like a good case for SSIS to me.
You can create a data flow from a flat file (as the data source) to a SQL DB (as the destination).
You can add some validation or selection steps in between.
You can easily find tutorials like this one online.
Alternatively, you can do what Gord mentioned: import the data from the text file into a local Access table and then use a single INSERT INTO LinkedTable SELECT * FROM LocalTable to copy the data to the SQL Server table.

What FoxPro data tools can I use to find corrupted data?

I have some SQL Server DTS packages that import data from a FoxPro database. This was working fine until recently. Now the script that imports data from one of the FoxPro tables bombs out about 470,000 records into the import. I'm just pulling the data into a table with nullable varchar fields so I'm thinking it must be a weird/corrupt data problem.
What tools would you use to track down a problem like this?
FYI, this is the error I'm getting:
Data for source column 1 ('field1') is not available. Your provider may require that all Blob columns be rightmost in the source result set.
There should not be any blob columns in this table.
Thanks for the suggestions. I don't know if it is a corruption problem for sure. I just started downloading FoxPro from my MSDN Subscription, so I'll see if I can open the table. SSRS opens the table; it just chokes before running through all the records. I'm just trying to figure out which record it's having a problem with.
Cmrepair is an excellent freeware utility to repair corrupted .DBF files.
Have you tried writing a small program that just copies the existing data to a new table? (A rough sketch of that idea follows below.)
Also:
http://fox.wikis.com/wc.dll?Wiki~TableCorruptionRepairTools~VFP
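A rough sketch of such a copy program, using Python and the dbf package that appears elsewhere on this page (the paths are hypothetical and the new()/Tables() calls mirror the other example): it copies one record at a time, so the loop stops at, and reports, the first record it cannot read.

import dbf

source = dbf.Table(r'C:\data\suspect.dbf')        # hypothetical path to the FoxPro table
copy = source.new(r'C:\data\suspect_copy.dbf')    # empty table with the same structure

with dbf.Tables(source, copy):
    for i, record in enumerate(source, start=1):
        try:
            copy.append(record)
        except Exception as exc:                  # report the first unreadable record
            print("problem at record %d: %s" % (i, exc))
            break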
My company uses Foxpro to store quite a bit of data... In my experience, data corruption is very obvious, with the table failing to open in the first place. Do you have a copy of foxpro to open the table with?
At 470,000 records you might want to check to see if you're approaching the 2 gigabyte limit on FoxPro table size. As I understand it, the records can still be there, but become inaccessible after the 2 gig point.
#Lance:
If you have access to the Visual FoxPro command window, type:
SET TABLEVALIDATE 11
USE "YourTable" EXCLUSIVE && If the table is damaged, VFP will display an error here
PACK && To rebuild the table and remove the records marked as deleted
PACK MEMO && If you have memo fields
After doing that, the structure of the table should be valid. If you want to see fields with invalid data, you can try:
SELECT * FROM YourTable WHERE EMPTY(YourField) && All records with YourField empty
SELECT * FROM YourTable WHERE LEN(YourMemoField) > 200 && All records with a long memo field, where there may be corrupted data
etc.
Use Repair Databases from my site (www.shershahsoft.com) for FREE (and it will always be FREE).
I have designed this program to repair damaged FoxPro/FoxBase/dBase files. The program is very quick: it will repair a 1 GB table in less than a minute.
You can assign files and folders to the program. As you start the program it will mark all the corrupted files, and by clicking the Repair or Check and Repair button, it will repair all the corrupted files. Moreover, it will create a "CorruptData" folder in each folder where the actual data exists, and will keep copies of the corrupt files there.
One thing to keep in mind: always run Windows CheckDisk on the drives where you store the files, because when records are being copied to a table and a power failure occurs, lost clusters can result, which Windows converts to files during CheckDisk. After that, Repair Databases will do the job for you.
I have used many paid and free programs that repair tables, but all such programs leave extra records with ambiguous characters in the tables (and they are time consuming too). The programmer then needs to find and delete such records manually. Repair Databases actually recovers the original records, so you need no further action. The only action you need is reindexing your files.
During the repair process, a File Open dialog sometimes appears asking you to locate the compact index file for a table with indexes. You may cancel the dialog at that point; the table will still be repaired, but you will need to reindex the file later. (This dialog may appear several times, depending upon the number of corrupted indexes.)
