How to represent/read a cube in a DXF file? - file-format

While trying to read the DXF file format (the actual project is in C++), I was able to understand the basic structure of the file, but I cannot figure out how a cube is actually represented.
For a cube in CAD, I expect at least 9 values:
X Y Z position
A B C rotation
W H D size of the cube
I expect X, Y, Z, A, B and C to be in the ENTITIES section,
but looking at example files, I see many settings, the preview image (taking significant space in the file), layouts, etc., but nothing that I can match to how the cube is actually built.
Question:
How to represent/read a cube in a DXF file?
More info
Here is the documentation about the file format:
http://help.autodesk.com/view/OARX/2018/ENU/?guid=GUID-235B22E0-A567-4CF6-92D3-38A2306D73F3
Here is an example file with a cube (created with BricsCAD). Unfortunately, I can't embed the file, as it is too big.
The cube measures 20 mm x 25 mm x 30 mm.
https://download.escain.org/example_cube_20_25_30_mm.dxf
I checked the LibreCAD source code, but it does not handle 3D models. Also, the libdxfrw library is too generic (it just calls the interface callback with the full entity data).
https://github.com/LibreCAD/LibreCAD_3
https://github.com/LibreCAD/libdxfrw

This cube is stored as embedded binary ACIS data (3DSOLID) and cannot be interpreted without the libraries from Spatial Inc. For more information, see my answer to another question:
How I can parse nurbs surface from dxf file? Or do you know library(for js, if exists or any other language) for parsing it?
EDIT: Finding the binary data of ACIS entities
Starting with R2013/AC1027, the modeler geometry (ACIS data) is stored in the ACDSDATA section in ACDSRECORDs. These records have no handle; instead they have an ID. The record for your 3DSOLID starts at line 22393 and has the ID 10:
0
ACDSRECORD
90
1
2
AcDbDs::ID
280
10
320
D2 <<< handle to 3DSOLID
2
ASM_Data
280
15
94
9259 <<< size in bytes
310
41534D2042696E61... <<< binary data as multiple tags of group code 310
This is your 3DSOLID with handle D2, which starts at line 2187:
0
3DSOLID
5
D2 <<< handle of your 3DSOLID
330
1F
100
AcDbEntity
8
0
100
AcDbModelerGeometry
290
0
2
{00000000-0000-0000-0000-000000000000}
100
AcDb3dSolid
350
0
As you can see, there is no link from the 3DSOLID to its binary content stored as an ACDSRECORD in the ACDSDATA section.
I have no knowledge of a table (DICTIONARY) that links this data together. The only way I know is to search all ACDSRECORDs in the ACDSDATA section for links (group code 320) to ACIS objects.
FYI: in DXF versions prior to R2013, the ACIS data is stored in the entity itself as ASCII text with a lousy XOR "encryption". All my knowledge about the DXF format is baked into my Python package: ezdxf.
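To illustrate that search, here is a minimal Python sketch (my own illustration, not part of ezdxf; the file name and the handle D2 come from the example file above) that scans an ASCII DXF file for the ACDSRECORD that links to a given entity handle and collects its binary data:

# Minimal sketch: collect the ACIS binary data (310 tags) of the ACDSRECORD
# that links (320 tag) to a given entity handle in an ASCII DXF file.
# Ignores binary DXF and assumes well-formed alternating tag/value lines.
def acis_data_for_handle(filename, handle):
    with open(filename, "rt") as fp:
        lines = [line.strip() for line in fp]
    tags = list(zip(lines[0::2], lines[1::2]))  # (group code, value) pairs
    chunks = []
    collecting = False
    for code, value in tags:
        if code == "0":  # start of the next record ends the current one
            collecting = False
        elif code == "320" and value == handle:
            collecting = True  # this ACDSRECORD references our entity
        elif collecting and code == "310":
            chunks.append(bytes.fromhex(value))
    return b"".join(chunks)

# e.g. the 3DSOLID with handle D2 from the example file:
# acis = acis_data_for_handle("example_cube_20_25_30_mm.dxf", "D2")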

Related

Unable to decode all information from Oracle RAW data

I have an application where I can upload files and add metadata to them. This metadata information is stored in a database, but parts of the added information are encoded somehow (sadly I have no access to the source code).
The raw representation of the metadata in the Oracle database is as follows:
00000009010000000000000000512005B69801505B000000010000000700000040000000010000000A0100000006496D616765000000003C000000010000000A010000000A696D6167652F706E670000000027000000030000000501000000010000000500000001010000000B64653A3132332E706E6700000002A8000000030000000501000000030000000700000001010000000E737461636B6F766572666C6F770000000042000000010000000A010000001844433078303166363565396420307830303033336433640000000A2600000001000000020100033D3D0000003E000000010000000A0100000021346266653539343939343631356333323861613736313431636337346134353900
Whereas the raw sequence
737461636B6F766572666C6F77
corresponds to
stackoverflow
The query
select UTL_RAW.CAST_TO_VARCHAR2(<raw_data>) from dual;
returns a string in which the values of the metadata are readable, but the names/identifiers of the properties are unreadable. The corresponding name/identifier of stackoverflow should be test, or a foreign key to a table that contains test. The other data contains additional information about the file (like the checksum, title or mime type).
Is it possible to retrieve the unreadable data (identifier) from the raw string?
RAW columns do not always contain a string; judging from the results, the content is binary data, more precisely an image file that has readable string fragments embedded among binary information.
Converting it to a varchar will generate invalid character codes, which are rendered as rectangular boxes.
What you are doing here with varchar is the equivalent of opening a binary file, e.g. a winword.doc or even a .jpeg, in Notepad.
To be able to get the content you need to treat it as image, not as varchar.
You can obtain the jpg file by using PLSQL as described here:
http://www.dba-oracle.com/t_extract_jpg_image_photo_sql_file.htm
Eventually it is possible to get all the content without loss in a char datatype using the following:
select RAWTOHEX(<raw_data>) from dual;
This will return the whole content as a character value containing its hexadecimal equivalent, which should not contain any invalid ANSI character (the ones represented with a rectangular box).
Of course, you will no longer be able to read "stackoverflow" or any other text directly, since you will only get a sequence of hex values.
You will need then from your program to convert it to binary/image and treat it properly.
Both "A01" and "101" are used to preface a 4 byte length followed by the Text, which is null terminated
00000009 010000000000000000512005B69801505B000000010000000700000040000000010000000A01
00000006 496D61676500 Image
0000003C 000000010000000A01
0000000A 696D6167652F706E6700 image/png
00000027 00000003000000050100000001000000050000000101
0000000B 64653A3132332E706E6700 de:123.png
000002A8 00000003000000050100000003000000070000000101
0000000E 737461636B6F766572666C6F7700 stackoverflow
00000042 000000010000000A01
00000018 444330783031663635653964203078303030333364336400
D C 0 x 0 1 f 6 5 e 9 d 0 x 0 0 0 3 3 d 3 d
00000A26 00000001000000020100033D3D0000003E000000010000000A01
00000021 346266653539343939343631356333323861613736313431636337346134353900
4 b f e 5 9 4 9 9 4 6 1 5 c 3 2 8 a a 7 6 1 4 1 c c 7 4 a 4 5 9
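As a rough illustration of this layout, here is a hedged Python sketch that walks the hex dump and extracts every block that looks like a 4-byte big-endian length followed by a null-terminated ASCII string (the scanning heuristic and the length cap are my assumptions, not a documented format):

import binascii

def extract_strings(raw_hex, max_len=64):
    """Heuristically pull length-prefixed, null-terminated ASCII strings
    out of the RAW metadata dump."""
    data = binascii.unhexlify(raw_hex)
    found, i = [], 0
    while i < len(data) - 4:
        length = int.from_bytes(data[i:i + 4], "big")
        chunk = data[i + 4:i + 4 + length]
        if 1 < length <= max_len and len(chunk) == length and chunk[-1] == 0:
            text = chunk[:-1]
            if text and all(32 <= b < 127 for b in text):
                found.append(text.decode("ascii"))
                i += 4 + length  # skip past the decoded block
                continue
        i += 1  # no match here, slide the window one byte
    return found

# extract_strings("00000009010000...")  # -> ['Image', 'image/png', 'de:123.png', ...]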

Retrieving raster data by geographic location using Landsat and PostGIS

The project I am working on requires that I retrieve Landsat raster data at specific geographic (lon/lat) locations. After sifting through some tutorials and experimenting with GDAL, PostGIS, and QGIS, I successfully imported a GeoTIFF Landsat image into a PostGIS raster table and accessed values by geographic location from that table. However, there were a few issues in the result:
I do not understand the coordinate system being used by QGIS in its interface, as the coordinates range in the hundreds of thousands.
The raster loaded into QGIS off the coast of Spain, rather than on top of Maine, USA, as it was supposed to.
Here's some information about my process. I am fairly new to GIS in general, so I am almost certain there's a blatant error to be found here:
Download Landsat 8 GeoTIFF file from USGS GloVis
Rename the band 5 image to something friendlier for command-line work.
Create postgres database for raster tables and run CREATE EXTENSION postgis;
Run gdalinfo LSSampleB5.TIF, printing the following output:
Driver: GTiff/GeoTIFF
Files: LSSampleB5Test2.TIF
Size is 7871, 7971
Coordinate System is:
PROJCS["WGS 84 / UTM zone 19N",
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-69],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",0],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]],
AXIS["Easting",EAST],
AXIS["Northing",NORTH],
AUTHORITY["EPSG","32619"]]
Origin = (318285.000000000000000,5216715.000000000000000)
Pixel Size = (30.000000000000000,-30.000000000000000)
Metadata:
AREA_OR_POINT=Point
Image Structure Metadata:
INTERLEAVE=BAND
Corner Coordinates:
Upper Left ( 318285.000, 5216715.000) ( 71d23'37.53"W, 47d 4'44.12"N)
Lower Left ( 318285.000, 4977585.000) ( 71d18' 9.77"W, 44d55'42.53"N)
Upper Right ( 554415.000, 5216715.000) ( 68d16'58.41"W, 47d 6' 6.11"N)
Lower Right ( 554415.000, 4977585.000) ( 68d18'36.69"W, 44d56'58.62"N)
Center ( 436350.000, 5097150.000) ( 69d49'20.56"W, 46d 1'29.87"N)
Band 1 Block=7871x1 Type=UInt16, ColorInterp=Gray
I interpreted this output as EPSG 4326 format (which may be my crime), so I ran the following command to import the GeoTIFF as a PostGIS raster:
raster2pgsql -s 4326 -I LSSampleB5.TIF -F -t 50x50 -d | psql -U postgres rastertest
This successfully imported a new table. I then used QGIS to get a visual intuition of what was going on.
Under Database -> DB Manager -> PostGIS -> rastertest -> public I added my lssampleb5 to the canvas.
I created a new XYZ Connection in QGIS to add Google satellite hybrid imagery for reference. The URL I used was https://mt1.google.com/vt/lyrs=y&x={x}&y={y}&z={z} with min and max zoom of 0 and 19, respectively.
Here is where I took note of the fact that the lssample layer landed off the coast of Spain on the Google Hybrid map.
I made sure both layers were on EPSG 4326 projection, no change.
Not too discouraged to move on, I tried a database query to get a single pixel value. Since my sample data landed near Spain, I used QGIS to sample a valid coordinate pair near there for the query. The query was:
SELECT rid, ST_Value(rast, 1, ST_SetSRID(ST_Point(448956,5041439), 4326)) as b5
FROM lssampleb5
WHERE ST_Intersects(rast, ST_SetSRID(ST_Point(448956,5041439), 4326)::geometry, 1);
This returned a valid row ID and an ST_VALUE of 5776. Trying coordinates outside the range displayed by QGIS resulted in no returned entries, which isn't unexpected.
So, first of all, I do not know what QGIS is using for its coordinate system. It's definitely not longitude and latitude in a raw form, but from my understanding, EPSG 4326 is supposed to be a geographic projection.
Second, I don't know why QGIS is misplacing the Landsat scene in the wrong place, or where in the process the scene was not transformed properly.
Join us at GIS SE, that's the place for GIS-related Q&A!
To help you out here:
Indeed, your crime was the CRS. The top-level PROJCS tag is the key here: it reads "WGS 84 / UTM zone 19N" from the data, with the EPSG reference at the bottom (AUTHORITY["EPSG","32619"]).
EPSG:32619 is a UTM projected CRS based on the WGS84 datum, with units in metres, defined as the projected distance to the corresponding reference meridian (Easting) and the equator (Northing). Since you declared the wrong CRS during import (EPSG:4326), the raster's inherent coordinate values were treated as degrees and the whole thing was placed at the other end of the world.
Run UpdateRasterSRID (SELECT UpdateRasterSRID('<schema_name>', '<your_raster_table>', 'rast', 32619);) to set the raster's metadata to the correct CRS, then reload the layer.
As for QGIS: it uses the CRS that you tell it to use. QGIS comes with a very handy on-the-fly reprojection feature (OTF; see the manual page on 'working with projections') that lets you define an arbitrary CRS in which your data is projected and displayed (i.e. it reprojects the data's CRS into the defined one in memory; the data's metadata stays untouched). You can find the quick-link button to the OTF settings in the bottom-right corner of the GUI; set it to your desired SRID (e.g. 4326). You'll notice how the visual representation of your data changes according to the chosen projection, and the displayed coordinates will use the CRS units, e.g. decimal degrees for WGS84.
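Putting both fixes together, here is a hedged Python sketch (connection details and the test coordinate are my assumptions; the table and column names come from the question) of the one-time repair plus a lon/lat query:

import psycopg2

# Connection parameters are assumptions; adjust to your setup.
conn = psycopg2.connect(dbname="rastertest", user="postgres")
cur = conn.cursor()

# One-time repair: stamp the raster metadata with the correct CRS.
cur.execute("SELECT UpdateRasterSRID('public', 'lssampleb5', 'rast', 32619);")
conn.commit()

# Query by lon/lat: build the point in EPSG:4326, then transform it
# into the raster's CRS (EPSG:32619) before sampling band 1.
lon, lat = -69.8, 46.0  # a point assumed to lie inside the scene near Maine
cur.execute(
    """
    SELECT rid, ST_Value(rast, 1, pt) AS b5
    FROM lssampleb5,
         ST_Transform(ST_SetSRID(ST_MakePoint(%s, %s), 4326), 32619) AS pt
    WHERE ST_Intersects(rast, pt);
    """,
    (lon, lat),
)
print(cur.fetchall())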

How to create runtime variable for reading csv file header using Pandas

I have a csv file. It logs some data depending upon the test condition.
The header of this csv file is like below:
UTC Time(s)  SVID-1  Constel-1  Status-1  Zij-1  SVID-2  Constel-2  Status-2  Zij-2  SVID-3  Constel-3  Status-3  Zij-3  ...
10102        1       G          P         0      2       G          P         0.3    3       G          A         --
...
Apart from the UTC Time column, the other columns may increase or decrease depending on the test condition or the number of satellites in use.
If an extra satellite appears or drops out, the corresponding SVID, Constel, Status and Zij columns will or will not be present.
I would like to know whether it is possible to create runtime variables for each column without looking into the csv file header.
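For what it's worth, here is a minimal pandas sketch of one way to do this (the file name and delimiter are assumptions; adjust sep to whatever the logger actually writes):

import pandas as pd

# Read the log; sep is an assumption, adjust to the real delimiter.
df = pd.read_csv("satellite_log.csv", sep=",")

# The columns are discovered at runtime; no need to hard-code the header.
print(list(df.columns))  # e.g. ['UTC Time(s)', 'SVID-1', 'Constel-1', ...]

# Group the per-satellite columns dynamically by their numeric suffix.
n_sats = sum(col.startswith("SVID-") for col in df.columns)
for i in range(1, n_sats + 1):
    sat = df.filter(regex=f"-{i}$")  # all columns ending in '-<i>'
    print(f"satellite {i}:", list(sat.columns))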

How to store mathematical expressions/explanations into database

I have been given the task of developing a website for maths students, with questions and their explanations. The site will have around 20,000 questions, and I need an effective way (easy storage, fast querying and fast rendering) to store those questions in the database.
Sample Question
In the first 10 overs of a cricket game, the run rate was only 3.2. What should be the run rate in the remaining 40 overs to reach the target of 282 runs?
Required run rate = (282 - 3.2 x 10) / 40 = 250 / 40 = 6.25
The question itself is a simple string and can easily be stored. But the real problem is: how do I store those expressions with brackets and division in the database?
You could store the expressions in LaTeX in the database.
Edit:
You can use libraries like http://www.mathjax.org/ for client-side rendering of the equations.
You have several options to store a string representation of mathematical expressions: MathML, LaTeX or ASCIIMathML.
For displaying it in a web browser I recommend MathJax.
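For instance, the fraction from the sample question above could be stored as a single LaTeX string and rendered client-side with MathJax (a sketch, not a prescribed schema):

\text{Required run rate} = \frac{282 - (3.2 \times 10)}{40} = \frac{250}{40} = 6.25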

How to order the ngrams in Google's database (or the one hosted on AWS) by frequency

I'm looking for a way to order Google Books Ngrams by frequency.
The original dataset is here: http://books.google.com/ngrams/datasets. Inside each file the ngrams are sorted alphabetically and then chronologically.
My computer is not powerful enough to handle 2.2 TB worth of data, so I think the only way to sort this would be "in the cloud".
The AWS-hosted version is here: http://aws.amazon.com/datasets/8172056142375670.
Is there a financially efficient way to find the 10,000 most frequent 1grams, 2grams, 3grams, 4grams, and 5grams?
To throw a wrench in it, the datasets contain data for multiple years:
As an example, here are the 30,000,000th and 30,000,001st lines from file 0
of the English 1-grams (googlebooks-eng-all-1gram-20090715-0.csv.zip):
circumvallate 1978 313 215 85
circumvallate 1979 183 147 77
The first line tells us that in 1978, the word "circumvallate" (which means
"surround with a rampart or other fortification", in case you were wondering)
occurred 313 times overall, on 215 distinct pages and in 85 distinct books
from our sample.
Ideally, the frequency lists would only contain data from 1980-present (the sum of each year).
Any help would be appreciated!
Cheers,
I would recommend using Pig!
Pig makes things like this very easy and straightforward. Here's a sample Pig script that does pretty much what you need:
raw = LOAD '/foo/input' USING PigStorage('\t') AS (ngram:chararray, year:int, count:int, pages:int, books:int);
filtered = FILTER raw BY year >= 1980;
grouped = GROUP filtered BY ngram;
counts = FOREACH grouped GENERATE group AS ngram, SUM(filtered.count) AS count;
sorted = ORDER counts BY count DESC;
limited = LIMIT sorted 10000;
STORE limited INTO '/foo/output' USING PigStorage('\t');
Pig on AWS Elastic MapReduce can even operate directly on S3 data, so you would probably replace /foo/input and /foo/output with S3 buckets too.
