Determine if a path is inside another path in Go

I would like to delete all path components for a file up to (but not including) an overall base directory.
Example:
/overall/basedir/a/b/c/file
I want to remove "file" and then remove "c", "b", and "a" if possible (i.e., stopping at any directory that is not empty). I do not want to unlink "basedir" or "overall".
filepath.HasPrefix would seem to be a good option, but it's apparently deprecated: https://golang.org/pkg/path/filepath/#HasPrefix
What I have now is:
p := THEPATH
// attempt to remove file and all parent directories up to the basedir
// FIXME: HasPrefix is apparently bad.. a better idea?
for filepath.HasPrefix(p, baseDir) {
	err := os.Remove(p)
	if err != nil {
		break
	}
	// climb up one
	p = filepath.Dir(p)
}
Looking for a succinct and reliable way that works on all platforms supported by Go.

IMHO, path handling is rather complicated if you want to support all platforms that are supported by Go. Below is the solution that I've implemented so far (probably not the simplest one). Notes:
It supports a generalized action rather than only os.Remove.
Instead of string-based path comparison, os.SameFile is used to test whether two files/directories are equal.
In the implementation, all candidate paths are first visited and added to the visitedPaths slice. Then, if no error occurs, the action is performed on each candidate path.
The code:
package pathwalker

import (
	"os"
	"path/filepath"
	"strings"
)

type PathAction func(PathInfo) error

type PathInfo struct {
	FileInfo os.FileInfo
	FullPath string
}

type PathWalker struct {
	pathName     string
	basePath     string
	visitedPaths []PathInfo
	lastFi       os.FileInfo
}

// NewPathWalker creates a PathWalker instance
func NewPathWalker(pathName, basePath string) *PathWalker {
	return &PathWalker{
		pathName: pathName,
		basePath: basePath,
	}
}
func (w *PathWalker) visit() (bool, error) {
	// Make sure path ends with separator, then clean it
	basePath := filepath.Clean(w.basePath + string(filepath.Separator))
	baseInfo, err := os.Lstat(basePath)
	if err != nil {
		return false, err
	}
	// Clean the path name
	fi, err := os.Lstat(w.pathName)
	if err != nil {
		return false, err
	} else if fi.IsDir() {
		// When the path is a directory, remove the trailing separator
		sep := string(filepath.Separator)
		cleanPath := filepath.Clean(w.pathName + sep)
		w.pathName = strings.TrimRight(cleanPath, sep)
	} else {
		w.pathName = filepath.Clean(w.pathName)
	}
	return w.doVisit(w.pathName, baseInfo)
}
// doVisit visits the path and its parents recursively
func (w *PathWalker) doVisit(pathName string, baseInfo os.FileInfo) (bool, error) {
	// Get file info
	fi, err := os.Lstat(pathName)
	if err != nil {
		return false, err
	}
	// Stop when basePath equals pathName
	if os.SameFile(fi, baseInfo) {
		return true, nil
	}
	// Top directory reached, but it does not match baseInfo
	if w.lastFi != nil && os.SameFile(w.lastFi, fi) {
		return false, nil
	}
	w.lastFi = fi
	// Append to the visited path list
	w.visitedPaths = append(w.visitedPaths, PathInfo{fi, pathName})
	// Move to the parent path
	up := filepath.Dir(pathName)
	if up == "." {
		return false, nil
	}
	// Visit the parent directory
	return w.doVisit(up, baseInfo)
}
// Walk performs the action and returns the number of processed paths and an error
func (w *PathWalker) Walk(act PathAction) (int, error) {
	n := 0
	ok, err := w.visit()
	if err != nil {
		return 0, err
	} else if ok && act != nil {
		for _, pi := range w.visitedPaths {
			err := act(pi)
			if err != nil {
				return n, err
			}
			n++
		}
	}
	return n, nil
}

// VisitedPaths returns the list of visited paths
func (w *PathWalker) VisitedPaths() []PathInfo {
	return w.visitedPaths
}
Then, if you want to remove a file and its parent directories under basePath, you can do:
func remove(pathName, basePath string) {
	act := func(p pathwalker.PathInfo) error {
		if p.FileInfo.IsDir() {
			fmt.Printf("  Removing directory=%s\n", p.FullPath)
			return os.Remove(p.FullPath)
		}
		fmt.Printf("  Removing file=%s\n", p.FullPath)
		return os.Remove(p.FullPath)
	}
	pw := pathwalker.NewPathWalker(pathName, basePath)
	n, err := pw.Walk(act)
	fmt.Printf("Removed: %d/%d, err=%v\n", n, len(pw.VisitedPaths()), err)
}
If you just want to test whether a path is inside another path, pass a no-op action (Walk only counts visited paths when an action is supplied) and check the count:
n, err := pathwalker.NewPathWalker(fileName, basePath).Walk(func(pathwalker.PathInfo) error { return nil })
if n > 0 && err == nil {
	// fileName is inside basePath
}
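For comparison, here is a minimal sketch of the simpler, purely lexical alternative to the deprecated filepath.HasPrefix, based on filepath.Rel (my own addition, not part of the answer above; unlike the os.SameFile approach it does not account for symlinks or case-insensitive filesystems):
import (
	"os"
	"path/filepath"
	"strings"
)

// insideBase reports whether path is strictly inside baseDir,
// comparing the cleaned, lexical forms of the two paths.
func insideBase(path, baseDir string) bool {
	rel, err := filepath.Rel(baseDir, path)
	if err != nil {
		return false
	}
	return rel != "." && rel != ".." &&
		!strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// removeUpTo removes path and then its parent directories, stopping at
// baseDir or at the first directory that cannot be removed (e.g. not empty).
func removeUpTo(path, baseDir string) {
	for p := filepath.Clean(path); insideBase(p, baseDir); p = filepath.Dir(p) {
		if err := os.Remove(p); err != nil {
			break
		}
	}
}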

Related

How to return an array from reading a file?

I have two columns in a CSV file. I am accessing only the first column using the SearchData() function.
The problem is that I want to access the data as an array, but when I return an array of strings from the AccessData() function and write products[0] in SearchData(), it gives me all the data with only the bracket signs [] removed, and when I write products[1], it gives me a runtime error: index out of range [1] with length 1.
Required result
products[0] = First Item
products[1] = Second Item
...
so on
Code
func AccessData(number int) string {
	content, err := ioutil.ReadFile("products/data1.csv")
	if err != nil {
		log.Fatal(err)
	}
	Data := string(content)
	sliceData := strings.Split(Data, ",")
	return sliceData[number]
}

func SearchData() {
	for i := 0; i <= 34; i = i + 2 {
		products := AccessData(i)
		fmt.Println(products)
	}
}
This should do the trick:
func firstColumns(filename string) []string {
	f, err := os.Open(filename)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	r := csv.NewReader(f)
	var result []string
	for {
		row, err := r.Read()
		if err != nil {
			if err == io.EOF {
				break
			}
			log.Fatal(err)
		}
		if len(row) > 0 {
			result = append(result, row[0])
		}
	}
	return result
}

func main() {
	data := firstColumns("products/data1.csv")
	fmt.Println(data)
	fmt.Println(data[1])
}
This turns the first column of every row into a []string, which can be accessed by index.
The output is:
[First item Second item]
Second item
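If the whole file comfortably fits in memory, a shorter variant (a sketch, not part of the answer above) reads everything at once with csv.Reader.ReadAll and then picks the first field of each row:
import (
	"encoding/csv"
	"os"
)

// firstColumnsAll returns the first field of every row in the CSV file.
func firstColumnsAll(filename string) ([]string, error) {
	f, err := os.Open(filename)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	rows, err := csv.NewReader(f).ReadAll()
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(rows))
	for _, row := range rows {
		if len(row) > 0 {
			out = append(out, row[0])
		}
	}
	return out, nil
}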

How can Go MD5 be so fast? crypto/md5

I need to compute the hash (md5 is ok) for a large number of files. So, in Go I have this code:
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

func strSliceRemove(slice []string, str string) []string {
	var tempSlice []string
	for _, item := range slice {
		if item != str {
			tempSlice = append(tempSlice, item)
		}
	}
	return tempSlice
}

func fileMD5(path string) (string, error) {
	var returnMD5String string
	file, err := os.Open(path)
	if err != nil {
		return returnMD5String, err
	}
	defer file.Close()

	hash := md5.New()
	if _, err := io.Copy(hash, file); err != nil {
		return returnMD5String, err
	}
	hashInBytes := hash.Sum(nil)[:16]
	returnMD5String = hex.EncodeToString(hashInBytes)
	return returnMD5String, nil
}

func main() {
	var doRead func(string)
	doRead = func(sd string) {
		filepath.Walk(sd, func(path string, f os.FileInfo, err error) error {
			resolvedPath, resolvedPathErr := filepath.EvalSymlinks(path)
			if resolvedPathErr != nil {
				return nil
			}
			if f.Mode()&os.ModeSymlink == os.ModeSymlink {
				doRead(resolvedPath)
			} else {
				if !f.IsDir() {
					md5, _ := fileMD5(path)
					fmt.Printf("%s\n", md5)
				}
			}
			return nil
		})
	}
	doRead("/tmp/electron")
}
It correctly hashes 1400 files in about one second. If I use my OS X md5 command-line utility, it takes more than ten times as long:
for FILE in `find /tmp/electron`; do
	if [ ! -d "$FILE" ]; then
		md5 $FILE;
	fi;
done;
I tried a basic C program that does the same (based on this answer: How to calculate the MD5 hash of a large file in C?) and the time is still more or less 10 seconds.
What kind of strategy / library does crypto/md5 use?
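One way to make the comparison fairer is to time the hashing alone, without file I/O or per-file process start-up. A minimal, self-contained sketch (not part of the question) that measures raw crypto/md5 throughput on an in-memory buffer:
package main

import (
	"crypto/md5"
	"fmt"
	"time"
)

func main() {
	buf := make([]byte, 64<<20) // 64 MiB of zero bytes

	h := md5.New()
	start := time.Now()
	h.Write(buf)
	sum := h.Sum(nil)
	elapsed := time.Since(start)

	mibPerSec := float64(len(buf)) / (1 << 20) / elapsed.Seconds()
	fmt.Printf("hashed %d bytes in %v (%.1f MiB/s), sum=%x\n", len(buf), elapsed, mibPerSec, sum)
}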

Decoding nested JSON objects in Go

I found some posts on how to decode nested JSON objects in Go and tried to apply the answers to my problem, but I only managed to find a partial solution.
My JSON file looks like this:
{
	"user":{
		"gender":"male",
		"age":"21-30",
		"id":"80b1ea88-19d7-24e8-52cc-65cf6fb9b380"
	},
	"trials":{
		"0":{"index":0,"word":"WORD 1","Time":3000,"keyboard":true,"train":true,"type":"A"},
		"1":{"index":1,"word":"WORD 2","Time":3000,"keyboard":true,"train":true,"type":"A"},
	},
	"answers":{
		"training":[
			{"ans":0,"RT":null,"gtAns":"WORD 1","correct":0},
			{"ans":0,"RT":null,"gtAns":"WORD 2","correct":0}
		],
		"test":[
			{"ans":0,"RT":null,"gtAns":true,"correct":0},
			{"ans":0,"RT":null,"gtAns":true,"correct":0}
		]
	}
}
Basically I need to parse the information inside it and save it into Go structures. With the code below I managed to extract the user information, but it looks too complicated to me, and it won't be easy to apply the same approach to the "answers" field, which contains two arrays with more than 100 entries each. Here is the code I'm using now:
type userDetails struct {
	Id     string `json:"id"`
	Age    string `json:"age"`
	Gender string `json:"gender"`
}

type jsonRawData map[string]interface{}

func getJsonContent(r *http.Request) userDetails {
	defer r.Body.Close()
	jsonBody, err := ioutil.ReadAll(r.Body)
	var userDataCurr userDetails
	if err != nil {
		log.Printf("Couldn't read request body: %s", err)
	} else {
		var f jsonRawData
		err := json.Unmarshal(jsonBody, &f)
		if err != nil {
			log.Printf("Error unmarshalling: %s", err)
		} else {
			user := f["user"].(map[string]interface{})
			userDataCurr.Id = user["id"].(string)
			userDataCurr.Gender = user["gender"].(string)
			userDataCurr.Age = user["age"].(string)
		}
	}
	return userDataCurr
}
Any suggestions? Thanks a lot!
You're doing it the hard way by using interface{} and not taking advantage of what encoding/json gives you.
I'd do something like this (note: I assumed there was an error with the type of the "gtAns" field and made it a boolean; you don't give enough information to know what to do with the "RT" field):
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"strconv"
	"strings"
)

const input = `{
	"user":{
		"gender":"male",
		"age":"21-30",
		"id":"80b1ea88-19d7-24e8-52cc-65cf6fb9b380"
	},
	"trials":{
		"0":{"index":0,"word":"WORD 1","Time":3000,"keyboard":true,"train":true,"type":"A"},
		"1":{"index":1,"word":"WORD 2","Time":3000,"keyboard":true,"train":true,"type":"A"}
	},
	"answers":{
		"training":[
			{"ans":0,"RT":null,"gtAns":true,"correct":0},
			{"ans":0,"RT":null,"gtAns":true,"correct":0}
		],
		"test":[
			{"ans":0,"RT":null,"gtAns":true,"correct":0},
			{"ans":0,"RT":null,"gtAns":true,"correct":0}
		]
	}
}`

type Whatever struct {
	User struct {
		Gender Gender   `json:"gender"`
		Age    Range    `json:"age"`
		ID     IDString `json:"id"`
	} `json:"user"`
	Trials map[string]struct {
		Index int    `json:"index"`
		Word  string `json:"word"`
		Time  int    // should this be a time.Duration?
		Train bool   `json:"train"`
		Type  string `json:"type"`
	} `json:"trials"`
	Answers map[string][]struct {
		Answer    int             `json:"ans"`
		RT        json.RawMessage // ??? what type is this
		GotAnswer bool            `json:"gtAns"`
		Correct   int             `json:"correct"`
	} `json:"answers"`
}

// Using some custom types to show custom marshalling:

type IDString string // TODO custom unmarshal and format/error checking

type Gender int

const (
	Male Gender = iota
	Female
)

func (g *Gender) UnmarshalJSON(b []byte) error {
	var s string
	err := json.Unmarshal(b, &s)
	if err != nil {
		return err
	}
	switch strings.ToLower(s) {
	case "male":
		*g = Male
	case "female":
		*g = Female
	default:
		return fmt.Errorf("invalid gender %q", s)
	}
	return nil
}

func (g Gender) MarshalJSON() ([]byte, error) {
	switch g {
	case Male:
		return []byte(`"male"`), nil
	case Female:
		return []byte(`"female"`), nil
	default:
		return nil, fmt.Errorf("invalid gender %v", g)
	}
}

type Range struct{ Min, Max int }

func (r *Range) UnmarshalJSON(b []byte) error {
	// XXX could be improved
	_, err := fmt.Sscanf(string(b), `"%d-%d"`, &r.Min, &r.Max)
	return err
}

func (r Range) MarshalJSON() ([]byte, error) {
	return []byte(fmt.Sprintf(`"%d-%d"`, r.Min, r.Max)), nil
	// Or:
	b := make([]byte, 0, 8)
	b = append(b, '"')
	b = strconv.AppendInt(b, int64(r.Min), 10)
	b = append(b, '-')
	b = strconv.AppendInt(b, int64(r.Max), 10)
	b = append(b, '"')
	return b, nil
}

func fromJSON(r io.Reader) (Whatever, error) {
	var x Whatever
	dec := json.NewDecoder(r)
	err := dec.Decode(&x)
	return x, err
}

func main() {
	// Use http.Get or whatever to get an io.Reader
	// (e.g. response.Body).
	// For the playground, substitute a fixed string:
	r := strings.NewReader(input)
	// If you actually had a string or []byte:
	//   var x Whatever
	//   err := json.Unmarshal([]byte(input), &x)
	x, err := fromJSON(r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(x)
	fmt.Printf("%+v\n", x)

	b, err := json.MarshalIndent(x, "", " ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Re-marshalled: %s\n", b)
}
Playground
Of course, if you want to reuse those sub-types, you could pull them out of the "Whatever" type into their own named types.
Also, note the use of a json.Decoder rather than reading in all the data ahead of time. In general, try to avoid ioutil.ReadAll unless you really need all the data at once.
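As a quick follow-up (not from the original answer), once Decode has filled in the Whatever value, the nested data is reached directly through its fields and maps; for example, this illustrative helper uses the field names from the Whatever type above:
// printSummary shows how the decoded fields are accessed.
func printSummary(x Whatever) {
	fmt.Println("user id:", x.User.ID)
	fmt.Println("age range:", x.User.Age.Min, "to", x.User.Age.Max)
	for key, trial := range x.Trials {
		fmt.Printf("trial %s: word=%q time=%d\n", key, trial.Word, trial.Time)
	}
	fmt.Println("training answers:", len(x.Answers["training"]))
	fmt.Println("test answers:", len(x.Answers["test"]))
}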

GAE Go — How to use GetMulti with non-existent entity keys?

I've found myself needing to do a GetMulti operation with an array of keys for which some entities exist, but some do not.
My current code, below, returns an error (datastore: no such entity).
err := datastore.GetMulti(c, keys, infos)
So how can I do this? I'd use a "get or insert" method, but there isn't one.
GetMulti can return an appengine.MultiError in this case. Loop through it and look for datastore.ErrNoSuchEntity. For example:
if err := datastore.GetMulti(c, keys, dst); err != nil {
	if me, ok := err.(appengine.MultiError); ok {
		for i, merr := range me {
			if merr == datastore.ErrNoSuchEntity {
				// keys[i] is missing
			}
		}
	} else {
		return err
	}
}
I know this topic has been up for more than a few days, but I'd like to post an alternative using a type switch.
if err := datastore.GetMulti(c, keys, dst); err != nil {
	switch errt := err.(type) {
	case appengine.MultiError:
		for ix, e := range errt {
			if e == datastore.ErrNoSuchEntity {
				// keys[ix] not found
			} else if e != nil {
				// keys[ix] has error "e"
			}
		}
	default:
		// datastore returned an error that is not a multi-error
	}
}
Thought I'd throw my answer in to show another use case. The following takes in any number of keys and returns only the valid keys.
// Validate keys
var validKeys []*ds.Key
if err := c.DB.GetMulti(ctx, tempKeys, dst); err != nil {
	if me, ok := err.(ds.MultiError); ok {
		for i, merr := range me {
			if merr == ds.ErrNoSuchEntity {
				continue
			}
			validKeys = append(validKeys, tempKeys[i])
		}
	} else {
		return "", err
	}
} else {
	// All tempKeys are valid
	validKeys = append(validKeys, tempKeys...)
}
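As for the "get or insert" method the question wishes for: the classic appengine/datastore API has no single call for it, but a common pattern is a transaction that falls back to Put when Get reports ErrNoSuchEntity. A hedged sketch (the Item type and defaultVal are illustrative, not from the question):
import (
	"appengine"
	"appengine/datastore"
)

// Item is an illustrative entity type.
type Item struct {
	Name string
}

// getOrInsert loads the entity for key into dst; if it does not exist,
// it stores defaultVal under that key and returns it in dst.
func getOrInsert(c appengine.Context, key *datastore.Key, dst *Item, defaultVal Item) error {
	return datastore.RunInTransaction(c, func(tc appengine.Context) error {
		err := datastore.Get(tc, key, dst)
		if err == datastore.ErrNoSuchEntity {
			*dst = defaultVal
			_, err = datastore.Put(tc, key, dst)
		}
		return err
	}, nil)
}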

Golang Determining whether *File points to file or directory

Is there a way to determine whether my *File is pointing to a file or a directory?
fileOrDir, err := os.Open(name)
// How do I know whether I have a file or directory?
I want to be able to read stats about the file if it is just a file, and to read the files within the directory if it is a directory.
fileOrDir.Readdirnames(0) // If dir
os.Stat(name) // If file
For example,
package main

import (
	"fmt"
	"os"
)

func main() {
	name := "FileOrDir"
	fi, err := os.Stat(name)
	if err != nil {
		fmt.Println(err)
		return
	}
	switch mode := fi.Mode(); {
	case mode.IsDir():
		// do directory stuff
		fmt.Println("directory")
	case mode.IsRegular():
		// do file stuff
		fmt.Println("file")
	}
}
Note:
The example is for Go 1.1. For Go 1.0, replace case mode.IsRegular(): with case mode&os.ModeType == 0:.
Here is another possibility:
import "os"
func IsDirectory(path string) (bool, error) {
fileInfo, err := os.Stat(path)
if err != nil{
return false, err
}
return fileInfo.IsDir(), err
}
Here is how to do the test in one line:
if info, err := os.Stat(path); err == nil && info.IsDir() {
	// ...
}
fileOrDir, err := os.Open(name)
if err != nil {
	// ...
}
info, err := fileOrDir.Stat()
if err != nil {
	// ...
}
if info.IsDir() {
	// ...
} else {
	// ...
}
Be careful not to stat the file by name and then open it by name; that produces a race condition with potential security implications.
If your Open succeeds, you have a valid file handle, and you should use its Stat() method to obtain the file info. The top answer is risky because it calls os.Stat() first and then presumably os.Open(), and someone could change the file between the two calls.
import "os"
// FileExists reports whether the named file exists as a boolean
func FileExists(name string) bool {
if fi, err := os.Stat(name); err == nil {
if fi.Mode().IsRegular() {
return true
}
}
return false
}
// DirExists reports whether the dir exists as a boolean
func DirExists(name string) bool {
if fi, err := os.Stat(name); err == nil {
if fi.Mode().IsDir() {
return true
}
}
return false
}
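Tying this back to the original question: with the handle from os.Open, a small sketch (my own, assuming the fileOrDir variable from the question) that reads stats for a regular file and lists the entries when it is a directory:
import (
	"fmt"
	"os"
)

// describe prints basic information for an already-open *os.File.
func describe(fileOrDir *os.File) error {
	info, err := fileOrDir.Stat() // stat via the handle, no second lookup by name
	if err != nil {
		return err
	}
	if info.IsDir() {
		names, err := fileOrDir.Readdirnames(0) // 0 means read all entries
		if err != nil {
			return err
		}
		fmt.Println("directory entries:", names)
		return nil
	}
	fmt.Println("size:", info.Size(), "mode:", info.Mode())
	return nil
}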
