I am trying to convert the Exif longitude and latitude data into a Google-friendly decimal value. I have read a lot of tutorials, but nothing seems to work for me. I may well be doing this completely wrong.
I successfully get the Exif data as follows:
$exif_data = exif_read_data($path . '/user/images/4.jpg');
$egeoLong = $exif_data['GPSLongitude'];
$egeoLat = $exif_data['GPSLatitude'];
$egeoLongR = $exif_data['GPSLongitudeRef'];
$egeoLatR = $exif_data['GPSLatitudeRef'];
From there I end up with the following values:
deg Long: 118/1
min Long: 2147/100
sec Long: 0/1
Longitude Ref: W
deg Lat: 34/1
min Lat: 433/100
sec Lat: 0/1
Latitude Ref: N
And I run them through this function to obtain a decimal value (I got this function here on Stack Overflow):
$geoLong = gpsDecimal($egeoLong[0], $egeoLong[1], $egeoLong[2], $egeoLongR);
$geoLat = gpsDecimal($egeoLat[0], $egeoLat[1], $egeoLat[2], $egeoLatR);
function gpsDecimal($deg, $min, $sec, $hemi) {
    $d = $deg + ((($min * 60) + $sec) / 3600);
    return ($hemi == 'S' || $hemi == 'W') ? -$d : $d;
}
This returns the following, which puts the coordinates in the middle of the ocean, not where I took the photo:
Latitude: 41.216666666667
Longitude: -153.78333333333
Can anyone tell me what I am doing wrong, or suggest a better way to achieve this? Any help is much appreciated.
I found that same formula, and it didn't work for me either. What I finally figured out, and what worked for me, was:
function toDecimal($deg, $min, $sec, $hem)
{
    $d = $deg + ((($min / 60) + ($sec / 3600)) / 100);
    return ($hem == 'S' || $hem == 'W') ? -$d : $d;
}
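Worth noting: the Exif values arrive as rational strings such as 2147/100, and PHP's numeric cast keeps only the digits before the slash ("2147/100" becomes 2147), which is why the plain degrees/minutes/seconds formula misbehaves on raw Exif data. A minimal sketch that converts the rationals explicitly before applying the standard formula (the helper name is just illustrative):
// Illustrative helper: convert an Exif rational string like "2147/100" to a float.
function exifRationalToFloat($rational) {
    $parts = explode('/', $rational);
    if (count($parts) == 2 && (float) $parts[1] != 0) {
        return (float) $parts[0] / (float) $parts[1];
    }
    return (float) $rational;
}

// The standard conversion then applies unchanged:
$geoLat = gpsDecimal(exifRationalToFloat($egeoLat[0]),
                     exifRationalToFloat($egeoLat[1]),
                     exifRationalToFloat($egeoLat[2]), $egeoLatR);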
I found that the best approximation is (based on the coordinates as translated by Google Maps):
$deg + ($min/60) + ($sec/36000000)
When I try to add two numbers, I get NaN. Any advice would be greatly appreciated.
This is my code:
var sum = parseInt(label.labellength, 10) + parseInt(label.labely, 10);
console.log("Sum of FrontRight is " + sum );
Output is: NaN
I tried the following, using Number:
var sum = Number(label.labellength) + Number(label.labely);
Output is: NaN
I ran into the same issue. Try using the || operator and returning that; it worked for me:
const [contextOverHr, setContextOverHr] = useState("")
return (
    {Object.values(workHour).reduce((pre, total) => {
        let totalNum = pre + total.OverHour
        setContextOverHr(totalNum)
        // || 0 falls back to 0 when the running total becomes NaN or undefined
        return totalNum || 0
    }, 0)}
)
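In the original example, the likely cause is that label.labellength or label.labely is undefined or not numeric, so both parseInt and Number return NaN. A minimal sketch of the same guard outside React (the label object here is illustrative data):
// Fall back to 0 whenever a field is missing or not numeric.
var label = { labellength: "12", labely: undefined }; // illustrative data
var sum = (parseInt(label.labellength, 10) || 0) + (parseInt(label.labely, 10) || 0);
console.log("Sum of FrontRight is " + sum); // 12 instead of NaN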
I was able to interpolate and plot chlorophyll concentration against the respective x and y. Now I need to display only a specific range of values, for example values between 0 and 10, as in the code below:
ncfile_1 = 'A20150322015059.L3m_MO_CHL_chlor_a_4km.nc';
lat_1 = ncread(ncfile_1,'lat') ;
lon_1 = ncread(ncfile_1,'lon') ;
chlor_a_1 = ncread(ncfile_1,'chlor_a');
[X_1,Y_1] = meshgrid(lon_1,lat_1) ;
xi_1 = linspace(30,100,1000) ;
yi_1 = linspace(0,30,1000) ;
[Xi_1,Yi_1] = meshgrid(xi_1,yi_1);
iwant_1 = interp2(X_1,Y_1,chlor_a_1',Xi_1,Yi_1)' ;
pcolor(xi_1,yi_1,iwant_1'); shading interp;
c = colorbar;
cmap = jet(255);
cmap(:,3) = 0;
colormap(cmap)
caxis([0, 10])
I can plot the figure, but the levels in the plot seem wrong. Please point out the mistake in my code.
I have tried this code, but it doesn't work:
if chlor_a_1>=0 && chlor_a_1<=10
disp(chlor_a_1);
[Xi_1,Yi_1,chlor_a_1] = peaks;
plot(Xi_1,Yi_1,chlor_a_1,49)
I am a beginner in MATLAB. Any help is highly appreciated.
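One way to restrict the displayed range (a minimal sketch, assuming iwant_1 is the interpolated grid from the code above): set everything outside [0, 10] to NaN before plotting, since pcolor leaves NaN cells blank.
% Mask values outside the range of interest; pcolor renders NaN as blank.
iwant_masked = iwant_1;
iwant_masked(iwant_masked < 0 | iwant_masked > 10) = NaN;
pcolor(xi_1, yi_1, iwant_masked'); shading interp;
caxis([0, 10]); colorbar;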
Looking for a solution to this issue: I'm trying to create a tensor from an array of timestamps,
[
1612892067115,
],
but here is what happens:
tf.tensor([1612892067115]).arraySync()
> [ 1612892078080 ]
as you can see, the result is incorrect.
Somebody pointed out that I may need to use the int64 datatype, but this doesn't seem to exist in tfjs.
I have also tried dividing my timestamp down to a small float, but I get a similar result:
tf.tensor([1.612892067115, 1.612892068341]).arraySync()
[ 1.6128920316696167, 1.6128920316696167 ]
If you know a way to work around using timestamps in a tensor, please help :)
Edit:
As an attempted workaround, I tried to remove the year, month, and date from my timestamp.
Here are my subsequent input values:
[
56969701,
56969685,
56969669,
56969646,
56969607,
56969602
]
and their outputs:
[
56969700,
56969684,
56969668,
56969648,
56969608,
56969600
]
As you can see, they are still incorrect, even though they should be well within the acceptable range.
I found a solution that worked for me:
Since I only require a subset of the timestamp (just the date / hour / minute / second / ms) for my purposes, I simply truncate out the year / month:
export const subts = (ts: number) => {
// a sub timestamp which can be used over the period of a month
const yearMonth = +new Date(new Date().getFullYear(), new Date().getMonth())
return ts - yearMonth
}
then I can use this with:
subTimestamps = timestamps.map(ts => subts(ts))
const x_vals = tf.tensor(subTimestamps, [subTimestamps.length], 'int32')
now all my results work as expected.
Currently only int32 is supported by tensorflow.js, and your data has gone out of the range supported by int32.
Until int64 is supported, this can be solved by using a relative timestamp. A timestamp in JS is the number of ms that have elapsed since 1 January 1970. A relative timestamp uses another origin and computes the difference in ms elapsed since that date. That way, we end up with a smaller number that can be represented using int32. The best origin to take is the starting date of the records.
const a = Date.now() // too large for int32: a tensor computed from this directly would be inaccurate
const origin = new Date("02/01/2021").getTime()
const relative = a - origin
const tensor = tf.tensor(relative, undefined, 'int32')
// get back the data
const data = tensor.dataSync()[0]
// get back the initial date
const initialDate = new Date(data + origin)
In other scenarios, if millisecond precision is not of interest, using the number of seconds that have elapsed since the epoch would be better; this is known as Unix time.
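A small sketch combining both ideas (the timestamps array below is illustrative): subtract the first record's time and convert to seconds, so the values stay comfortably inside int32.
// Illustrative data: the ms timestamps from the question.
const timestamps = [1612892067115, 1612892068341];
const base = timestamps[0];
// Seconds relative to the first record - small enough for int32.
const relativeSecs = timestamps.map(t => Math.round((t - base) / 1000));
const xs = tf.tensor(relativeSecs, [relativeSecs.length], 'int32');
console.log(xs.arraySync()); // [0, 1]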
I'm trying to read an input file in Scala that I know the structure of, however I only need every 9th entry. So far I have managed to read the whole thing using:
val lines = sc.textFile("hdfs://moonshot-ha-nameservice/" + args(0))
val fields = lines.map(line => line.split(","))
The issue is that this leaves me with a huge dataset (we're talking 20 GB of data). Not only have I been forced to write some very ugly code to convert between RDD[Array[String]] and Array[String], it has essentially made my code useless.
I've tried different approaches and mixes of .map(), .flatMap() and .reduceByKey(), however nothing actually puts my collected "cells" into the format I need them in.
Here's what is supposed to happen: Reading a folder of text files from our server, the code should read each "line" of text in the format:
*---------*
| NASDAQ: |
*---------*
exchange, stock_symbol, date, stock_price_open, stock_price_high, stock_price_low, stock_price_close, stock_volume, stock_price_adj_close
and only keep hold of the stock_symbol, as that is the identifier I'm counting. So far my attempt has been to turn the entire thing into an array and collect every 9th index into a collected_cells var. The issue is that, based on my calculations and real-life results, that code would take 335 days to run (no joke).
Here's my current code for reference:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object SparkNum {
def main(args: Array[String]) {
// Do some Scala voodoo
val sc = new SparkContext(new SparkConf().setAppName("Spark Numerical"))
// Set input file as per HDFS structure + input args
val lines = sc.textFile("hdfs://moonshot-ha-nameservice/" + args(0))
val fields = lines.map(line => line.split(","))
var collected_cells:Array[String] = new Array[String](0)
//println("[MESSAGE] Length of CC: " + collected_cells.length)
val divider:Long = 9
val array_length = fields.count / divider
val casted_length = array_length.toInt
val indexedFields = fields.zipWithIndex
val indexKey = indexedFields.map{case (k,v) => (v,k)}
println("[MESSAGE] Number of lines: " + array_length)
println("[MESSAGE] Casted lenght of: " + casted_length)
for( i <- 1 to casted_length ) {
println("[URGENT DEBUG] Processin line " + i + " of " + casted_length)
var index = 9 * i - 8
println("[URGENT DEBUG] Index defined to be " + index)
collected_cells :+ indexKey.lookup(index)
}
println("[MESSAGE] collected_cells size: " + collected_cells.length)
val single_cells = collected_cells.flatMap(collected_cells => collected_cells);
val counted_cells = single_cells.map(cell => (cell, 1)).reduceByKey{case (x, y) => x + y}
// val result = counted_cells.reduceByKey((a,b) => (a+b))
// val inmem = counted_cells.persist()
//
// // Collect driver into file to be put into user archive
// inmem.saveAsTextFile("path to server location")
// ==> Not necessary to save the result as processing time is recorded, not output
}
}
The bottom part is currently commented out from when I tried to debug it, but it acts as pseudo-code showing what I need done. I should point out that I am barely familiar with Scala, so things like the _ notation confuse the life out of me.
Thanks for your time.
There are some concepts that need clarification in the question:
When we execute this code:
val lines = sc.textFile("hdfs://moonshot-ha-nameservice/" + args(0))
val fields = lines.map(line => line.split(","))
That does not result in a huge array of the size of the data. That expression represents a transformation of the base data. It can be further transformed until we reduce the data to the information set we desire.
In this case, we want the stock_symbol field of a record encoded as CSV:
exchange, stock_symbol, date, stock_price_open, stock_price_high, stock_price_low, stock_price_close, stock_volume, stock_price_adj_close
I'm also going to assume that the data file contains a banner like this:
*---------*
| NASDAQ: |
*---------*
The first thing we're going to do is to remove anything that looks like this banner. In fact, I'm going to assume that the first field is the name of a stock exchange that starts with a letter. We will do this before we do any splitting, resulting in:
val lines = sc.textFile("hdfs://moonshot-ha-nameservice/" + args(0))
val validLines = lines.filter(line => !line.isEmpty && line.head.isLetter)
val fields = validLines.map(line => line.split(","))
It helps to write out the types of the variables, for peace of mind that we have the data types we expect. As our Scala skills progress, that might become less important. Let's rewrite the expressions above with types:
val lines: RDD[String] = sc.textFile("hdfs://moonshot-ha-nameservice/" + args(0))
val validLines: RDD[String] = lines.filter(line => !line.isEmpty && line.head.isLetter)
val fields: RDD[Array[String]] = validLines.map(line => line.split(","))
We are interested in the stock_symbol field, which positionally is the element #1 in a 0-based array:
val stockSymbols:RDD[String] = fields.map(record => record(1))
If we want to count the symbols, all that's left is to issue a count:
val totalSymbolCount = stockSymbols.count()
That's not very helpful because we have one entry for every record. Slightly more interesting questions would be:
How many different stock symbols we have?
val uniqueStockSymbols = stockSymbols.distinct.count()
How many records for each symbol do we have?
val countBySymbol = stockSymbols.map(s => (s,1)).reduceByKey(_+_)
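To peek at a few of those per-symbol counts on the driver, a small usage sketch:
countBySymbol.take(10).foreach(println)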
In Spark 2.0, CSV support for DataFrames and Datasets is available out of the box.
Given that our data does not have a header row with the field names (which is usual in large datasets), we will need to provide the column names:
import org.apache.spark.sql.functions.count
import sparkSession.implicits._

// Nine column names to match the nine CSV fields listed above.
val stockDF = sparkSession.read.csv("/tmp/quotes_clean.csv").toDF("exchange", "symbol", "date", "open", "high", "low", "close", "volume", "adj_close")
We can answer our questions very easily now:
val uniqueSymbols = stockDF.select("symbol").distinct().count
val recordsPerSymbol = stockDF.groupBy($"symbol").agg(count($"symbol"))
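And to inspect the aggregation result, a small usage sketch:
recordsPerSymbol.show(10)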
I'm trying to follow a document that has some code on text mining clustering analysis.
I'm fairly new to R and the concept of text mining/clustering, so please bear with me if I sound illiterate.
I create a simple matrix called dtm and then run kmeans to produce 3 clusters. The code I'm having issues with is where a function has been defined to get the "five most common words of the documents in the cluster":
dtm0.75 = as.matrix(dt0.75)
dim(dtm0.75)
kmeans.result = kmeans(dtm0.75, 3)
perClusterCounts = function(df, clusters, n)
{
v = sort(colSums(df[clusters == n, ]),
decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
d[1:5, ]
}
perClusterCounts(dtm0.75, kmeans.result$cluster, 1)
Upon running this code I get the following error:
Error in colSums(df[clusters == n, ]) :
'x' must be an array of at least two dimensions
Could someone help me fix this please?
Thank you.
I can't reproduce your error; it works fine for me. Update your question with a reproducible example and you might get a more useful answer. Perhaps your input data object is empty; what do you get with dim(dtm0.75)?
Here it is working fine on the data that comes with the tm package:
library(tm)
data(crude)
dt0.75 <- DocumentTermMatrix(crude)
dtm0.75 = as.matrix(dt0.75)
dim(dtm0.75)
kmeans.result = kmeans(dtm0.75, 3)
perClusterCounts = function(df, clusters, n)
{
v = sort(colSums(df[clusters == n, ]),
decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
d[1:5, ]
}
perClusterCounts(dtm0.75, kmeans.result$cluster, 1)
word freq
the the 69
and and 25
for for 12
government government 11
oil oil 10
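If the error does show up on your own data, one likely cause: when a cluster contains only a single document, df[clusters == n, ] drops down to a plain vector, and colSums then fails with exactly this message. Adding drop = FALSE keeps the two-dimensional shape:
v = sort(colSums(df[clusters == n, , drop = FALSE]),
         decreasing = TRUE)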