The blobstore API has no function to list all blobs. How can I get this list and then delete all blobs?
The blobstore API on App Engine for Go has no way to do this. Instead, use the datastore to fetch __BlobInfo__ entities as appengine.BlobInfo. Although appengine.BlobInfo claims to have a BlobKey field, it is not populated; instead, take the string ID of the returned datastore key and convert it to an appengine.BlobKey, which you can then pass to blobstore.Delete.
Here's a handler at "/tasks/delete-blobs" that deletes up to 20,000 blobs per request and re-enqueues itself until they are all gone. Note that cursors are not used here; I suspect __BlobInfo__ is special and doesn't support them (when I attempted to use them, they did nothing).
func DeleteBlobs(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	c = appengine.Timeout(c, time.Minute)

	q := datastore.NewQuery("__BlobInfo__").KeysOnly()
	it := q.Run(c)
	wg := sync.WaitGroup{}
	something := false
	// Up to 20 batches of 1000 keys per request.
	for batch := 0; batch < 20; batch++ {
		var bk []appengine.BlobKey
		for i := 0; i < 1000; i++ {
			k, err := it.Next(nil)
			if err == datastore.Done {
				break
			} else if err != nil {
				c.Errorf("err: %v", err)
				continue
			}
			bk = append(bk, appengine.BlobKey(k.StringID()))
		}
		if len(bk) == 0 {
			break
		}
		something = true
		wg.Add(1)
		go func(bk []appengine.BlobKey) {
			defer wg.Done()
			c.Errorf("deleting %v blobs", len(bk))
			if err := blobstore.DeleteMulti(c, bk); err != nil {
				c.Errorf("blobstore delete err: %v", err)
			}
		}(bk)
	}
	wg.Wait()
	if something {
		taskqueue.Add(c, taskqueue.NewPOSTTask("/tasks/delete-blobs", nil), "")
	}
}
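For completeness, a minimal sketch of wiring this up, assuming the usual init-based handler registration (the registration itself is not part of the original answer):
func init() {
	// Requesting /tasks/delete-blobs once (or enqueueing the first task by hand)
	// starts the chain; the handler keeps re-enqueueing itself until no
	// __BlobInfo__ keys remain.
	http.HandleFunc("/tasks/delete-blobs", DeleteBlobs)
}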
I want to create an abstract function that gets data from the DB and fills an array with that data. The element type of the array can differ from call to call, and I want to do it without reflect because of performance concerns.
I just want to call some function like GetDBItems() everywhere and get an array of data from the DB with the desired type, but every implementation I come up with is awful.
Here is the current implementation of this function:
type AbstractArrayGetter func(size int) []interface{}

func GetItems(arrayGetter AbstractArrayGetter) {
	res := DBResponse{}
	DB.Get(&res)
	arr := arrayGetter(len(res.Rows))
	for i := 0; i < len(res.Rows); i++ {
		json.Unmarshal(res.Rows[i].Value, arr[i]) // arr[i] is already a pointer to the target element
	}
}
Here I call this function:
var events []Event
GetFullItems("events", "events_list", map[string]interface{}{}, func(size int) []interface{} {
	events = make([]Event, size)
	proxyEnt := make([]interface{}, size)
	for i := range events {
		proxyEnt[i] = &events[i]
	}
	return proxyEnt
})
It works, but there is too much code needed just to call the function, and there is also a performance cost in copying the events array into the interfaces array.
How can I do this without reflect and with a short call site? Or is reflect not too slow in this case?
I tested the performance with reflect, and it is similar to the solution mentioned above. So here is a solution using reflect, if someone needs it. This function gets data from the DB and fills an abstract array:
func GetItems(design string, viewName string, opts map[string]interface{}, arrayType interface{}) (interface{}, error) {
	res := couchResponse{}
	opts["limit"] = 100000
	bytes, err := CouchView(design, viewName, opts)
	if err != nil {
		return nil, err
	}
	err = json.Unmarshal(bytes, &res)
	if err != nil {
		return nil, err
	}
	// Build a []T, where T is the dynamic type of the arrayType argument.
	dataType := reflect.TypeOf(arrayType)
	slice := reflect.MakeSlice(reflect.SliceOf(dataType), len(res.Rows), len(res.Rows))
	for i := 0; i < len(res.Rows); i++ {
		if opts["include_docs"] == true {
			err = json.Unmarshal(res.Rows[i].Doc, slice.Index(i).Addr().Interface())
		} else {
			err = json.Unmarshal(res.Rows[i].Value, slice.Index(i).Addr().Interface())
		}
		if err != nil {
			return nil, err
		}
	}
	// Return a *[]T so the caller can type-assert the result.
	x := reflect.New(slice.Type())
	x.Elem().Set(slice)
	return x.Interface(), nil
}
and getting data using this function:
var e Event
res, err := GetItems("data", "data_list", map[string]interface{}{}, e)
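Since GetItems returns the slice behind an interface{}, the caller still has to assert it back to the concrete type; a minimal sketch, assuming the call above:
if err != nil {
	log.Fatal(err)
}
events := *res.(*[]Event) // GetItems returns a *[]Event wrapped in an interface{}
fmt.Println(len(events))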
I am trying to update a lot of records, which cannot be done within the one-minute maximum request time, so I need to use a datastore.Cursor. But for some reason the returned cursor is always the same, so each redirect runs with the same cursor value and the same 20 database updates are performed every time.
Any ideas why this isn't working the way I expect?
http.HandleFunc("/fix", func(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	fixUser(c, w, r, "/fix", func(u *User) error {
		// do the fix here
		return nil
	})
})
func fixUser(ctx context.Context, w http.ResponseWriter, r *http.Request, path string, fn func(user *User) error) {
	q := datastore.NewQuery("users")
	c := r.URL.Query().Get("c")
	if len(c) > 0 {
		cursor, err := datastore.DecodeCursor(c)
		if err != nil {
			w.WriteHeader(http.StatusInternalServerError)
			w.Write([]byte(err.Error()))
			return
		}
		q.Start(cursor)
	}
	iter := q.Run(ctx)
	var cr datastore.Cursor
	for i := 0; i < 20; i++ {
		var u User
		key, err := iter.Next(&u)
		if err == datastore.Done {
			return
		}
		if err != nil {
			panic(err.Error())
		}
		cr, _ = iter.Cursor()
		log.Debugf(ctx, "Cursor: %v", cr) // always the same value
		u.Key = key
		fn(&u)
	}
	pathWithCursor := fmt.Sprintf("%s?c=%s", path, cr.String())
	http.Redirect(w, r, pathWithCursor, 301)
}
I looked at some of my own cursor code and compared it against yours. The main difference is that I use q = q.Start(cursor) rather than q.Start(cursor) on its own. This should fix your problem: the query returned by Start reflects the position specified by the cursor, and unless you store it back into the q variable, your query is never updated.
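In other words, the cursor branch of fixUser should read:
if len(c) > 0 {
	cursor, err := datastore.DecodeCursor(c)
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		w.Write([]byte(err.Error()))
		return
	}
	q = q.Start(cursor) // Start returns a derived query; keep the result
}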
I had this convenient function in Python:
def follow(path):
    with open(path) as lines:
        lines.seek(0, 2)  # seek to EOF
        while True:
            line = lines.readline()
            if not line:
                time.sleep(0.1)
                continue
            yield line
It does something similar to UNIX tail -f: you get new lines of the file as they are appended. It's convenient because you can obtain the generator without blocking and pass it to another function.
Then I had to do the same thing in Go. I'm new to this language, so I'm not sure whether what I did is idiomatic/correct enough for Go.
Here is the code:
func Follow(fileName string) chan string {
	out_chan := make(chan string)
	file, err := os.Open(fileName)
	if err != nil {
		log.Fatal(err)
	}
	file.Seek(0, os.SEEK_END)
	bf := bufio.NewReader(file)

	go func() {
		defer file.Close()
		defer close(out_chan)
		for {
			line, _, _ := bf.ReadLine()
			if len(line) == 0 {
				time.Sleep(10 * time.Millisecond)
			} else {
				out_chan <- string(line)
			}
		}
	}()

	return out_chan
}
Is there a cleaner way to do this in Go? I have a feeling that using an asynchronous call for such a thing is overkill, and it really bothers me.
Create a wrapper around a reader that sleeps on EOF:
type tailReader struct {
	io.ReadCloser
}

func (t tailReader) Read(b []byte) (int, error) {
	for {
		n, err := t.ReadCloser.Read(b)
		if n > 0 {
			return n, nil
		} else if err != io.EOF {
			return n, err
		}
		time.Sleep(10 * time.Millisecond)
	}
}

func newTailReader(fileName string) (tailReader, error) {
	f, err := os.Open(fileName)
	if err != nil {
		return tailReader{}, err
	}
	if _, err := f.Seek(0, 2); err != nil {
		return tailReader{}, err
	}
	return tailReader{f}, nil
}
This reader can be used anywhere an io.Reader can be used. Here's how to loop over lines using bufio.Scanner:
t, err := newTailReader("somefile")
if err != nil {
	log.Fatal(err)
}
defer t.Close()

scanner := bufio.NewScanner(t)
for scanner.Scan() {
	fmt.Println(scanner.Text())
}
if err := scanner.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "reading:", err)
}
The reader can also be used to loop over JSON values appended to the file:
t, err := newTailReader("somefile")
if err != nil {
	log.Fatal(err)
}
defer t.Close()

dec := json.NewDecoder(t)
for {
	var v SomeType
	if err := dec.Decode(&v); err != nil {
		log.Fatal(err)
	}
	fmt.Println("the value is ", v)
}
There are a couple of advantages this approach has over the goroutine approach outlined in the question. The first is that shutdown is easy. Just close the file. There's no need to signal the goroutine that it should exit. The second advantage is that many packages work with io.Reader.
The sleep time can be adjusted up or down to meet specific needs. Decrease the time for lower latency and increase the time to reduce CPU use. A sleep of 100ms is probably fast enough for data that's displayed to humans.
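If different callers need different latencies, the interval could also be made a field of the reader rather than a constant; a minimal sketch of that variant (the poll field is an addition, not part of the answer above, and newTailReader would need to set it):
type tailReader struct {
	io.ReadCloser
	poll time.Duration // how long to wait after hitting EOF
}

func (t tailReader) Read(b []byte) (int, error) {
	for {
		n, err := t.ReadCloser.Read(b)
		if n > 0 {
			return n, nil
		} else if err != io.EOF {
			return n, err
		}
		time.Sleep(t.poll)
	}
}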
Check out this Go package for reading from continuously updated files (tail -f): https://github.com/hpcloud/tail
t, err := tail.TailFile("filename", tail.Config{Follow: true})
for line := range t.Lines {
	fmt.Println(line.Text)
}
I'm using Go and the Google task queue in order to create some async jobs.
I'm passing the data to the worker method successfully, but I can't unmarshal the data in order to use it.
I tried different ways, but I keep getting an unmarshal error:
err um &json.SyntaxError{msg:"invalid character 'i' in literal false (expecting 'a')", Offset:2}
This is how I'm sending the data to the queue:
keys := make(map[string][]string)
keys["filenames"] = req.FileNames // []string
t := taskqueue.NewPOSTTask("/deletetask", keys)
_, err = taskqueue.Add(ctx, t, "delete")
And this is how I tried to unmarshal it:
type Files struct {
	fileNames string `json:"filenames"`
}

func worker(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	var objmap map[string]json.RawMessage
	b, err := ioutil.ReadAll(r.Body)
	if err != nil {
		c.Debugf("err io %#v", err)
	}
	c.Debugf("b %#v", string(b[:])) // Prints: b "filenames=1.txt&filenames=2.txt"
	err = json.Unmarshal(b, &objmap)
	if err != nil {
		c.Debugf("err um %#v", err)
	}
	// This didn't work either, same error:
	f := []Files{}
	err = json.Unmarshal(b, &f)
}
Arguments to tasks are sent as POST values; you assign a slice of strings as the POST value for the key filenames.
Then you try to deserialize the full POST request body as JSON, but the body is URL-encoded form data, not JSON, which is why json.Unmarshal fails.
A simple solution would be to split the work into one task per file and just send the file name as a single string value. It would look something like this:
// Add tasks
for i := range req.FileNames {
	postValues := url.Values{}
	postValues.Set("fileName", req.FileNames[i])
	t := taskqueue.NewPOSTTask("/deletetask", postValues)
	if _, err := taskqueue.Add(ctx, t, "delete"); err != nil {
		ctx.Errorf("Failed to add task, error: %v, fileName: %v", err, req.FileNames[i])
	}
}

// The actual delete worker
func worker(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	fileName := r.FormValue("fileName")
	ctx.Infof("Deleting: %v", fileName)
}
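Alternatively, if all the file names should stay in a single task: taskqueue.NewPOSTTask encodes its map as an ordinary form body (hence the "filenames=1.txt&filenames=2.txt" debug output), so the worker can read it with form parsing rather than json.Unmarshal. A minimal sketch, assuming the task is added with the filenames key as in the question:
func worker(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	if err := r.ParseForm(); err != nil {
		ctx.Errorf("parse form: %v", err)
		return
	}
	// r.Form["filenames"] holds one entry per repeated POST value.
	for _, name := range r.Form["filenames"] {
		ctx.Infof("Deleting: %v", name)
	}
}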
I am writing a basic program to read values from a database table and print them in a table. The table was populated by an ancient program. Some of the fields in a row are optional, and when I try to read them as strings, I get the following error:
panic: sql: Scan error on column index 2: unsupported driver -> Scan pair: <nil> -> *string
After reading other questions about similar issues, I came up with the following code to handle the NULL values. It works fine in practice: I get the values as plain text, with an empty string in place of the NULLs.
However, I have two concerns:
This does not look efficient. I need to handle 25+ fields like this, which would mean reading each of them as bytes and converting to a string: too many function calls and conversions, two structs to handle the data, and so on...
The code looks ugly. It is already convoluted with 2 fields and becomes unreadable as I go to 25+.
Am I doing this wrong? Is there a better/cleaner/more efficient/idiomatic Go way to read values from a database?
I find it hard to believe that a modern language like Go would not handle database return values gracefully.
Thanks in advance!
Code snippet:
// DB read format
type udInfoBytes struct {
	id    []byte
	state []byte
}

// output format
type udInfo struct {
	id    string
	state string
}

func CToGoString(c []byte) string {
	n := -1
	for i, b := range c {
		if b == 0 {
			break
		}
		n = i
	}
	return string(c[:n+1])
}

func dbBytesToString(in udInfoBytes) udInfo {
	var out udInfo
	out.id = CToGoString(in.id)
	out.state = stateName(in.state)
	return out
}
func GetInfo(ud string) udInfo {
	db := getFileHandle()
	q := fmt.Sprintf("SELECT id,state FROM Mytable WHERE id='%s' ", ud)

	rows, err := db.Query(q)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	ret := udInfo{}
	r := udInfoBytes{}
	for rows.Next() {
		err := rows.Scan(&r.id, &r.state)
		if err != nil {
			log.Println(err)
		}
		break
	}
	err = rows.Err()
	if err != nil {
		log.Fatal(err)
	}
	ret = dbBytesToString(r)
	defer db.Close()
	return ret
}
edit:
I want to have something like the following, where I do not have to worry about handling NULL and the values are automatically read as empty strings.
// output format
type udInfo struct {
	id    string
	state string
}

func GetInfo(ud string) udInfo {
	db := getFileHandle()
	q := fmt.Sprintf("SELECT id,state FROM Mytable WHERE id='%s' ", ud)

	rows, err := db.Query(q)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	r := udInfo{}
	for rows.Next() {
		err := rows.Scan(&r.id, &r.state)
		if err != nil {
			log.Println(err)
		}
		break
	}
	err = rows.Err()
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	return r
}
There are separate types to handle NULL values coming from the database, such as sql.NullString, sql.NullBool, sql.NullFloat64, etc.
For example:
var s sql.NullString
err := db.QueryRow("SELECT name FROM foo WHERE id=?", id).Scan(&s)
...
if s.Valid {
	// use s.String
} else {
	// NULL value
}
Go's database/sql package also handles pointers to the value type: scan into a *string and a NULL column comes back as a nil pointer.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	_, err = db.Exec("create table foo(id integer primary key, value text)")
	if err != nil {
		log.Fatal(err)
	}
	_, err = db.Exec("insert into foo(value) values(null)")
	if err != nil {
		log.Fatal(err)
	}
	_, err = db.Exec("insert into foo(value) values('bar')")
	if err != nil {
		log.Fatal(err)
	}

	rows, err := db.Query("select id, value from foo")
	if err != nil {
		log.Fatal(err)
	}
	for rows.Next() {
		var id int
		var value *string
		err = rows.Scan(&id, &value)
		if err != nil {
			log.Fatal(err)
		}
		if value != nil {
			fmt.Println(id, *value)
		} else {
			fmt.Println(id, value)
		}
	}
}
You should get output like this:
1 <nil>
2 bar
An alternative solution would be to handle this in the SQL statement itself by using the COALESCE function (though not all DBs may support it).
For example you could instead use:
q := fmt.Sprintf("SELECT id,COALESCE(state, '') as state FROM Mytable WHERE id='%s' ", ud)
which would effectively give 'state' a default value of an empty string in the event that it was stored as a NULL in the db.
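With that query the scan targets can simply be plain strings, since the driver never sees a NULL; a minimal sketch based on the GetInfo function above:
q := fmt.Sprintf("SELECT id,COALESCE(state, '') as state FROM Mytable WHERE id='%s' ", ud)
rows, err := db.Query(q)
if err != nil {
	log.Fatal(err)
}
defer rows.Close()

r := udInfo{}
for rows.Next() {
	if err := rows.Scan(&r.id, &r.state); err != nil {
		log.Println(err)
	}
	break
}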
Two ways to handle those nulls:
Using sql.NullString:
if value.Valid {
	return value.String
}
Using *string:
if value != nil {
	return *value
}
https://medium.com/@raymondhartoyo/one-simple-way-to-handle-null-database-value-in-golang-86437ec75089
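Either form can be wrapped in a tiny helper so the 25+ fields don't each need their own if block; a minimal sketch (the helper name is made up):
func nullToString(v sql.NullString) string {
	if v.Valid {
		return v.String
	}
	return ""
}

// Usage with the asker's two columns:
var id, state sql.NullString
if err := rows.Scan(&id, &state); err != nil {
	log.Println(err)
}
info := udInfo{id: nullToString(id), state: nullToString(state)}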
I've started to use the MyMySql driver, as it has a nicer interface than that of the standard library.
https://github.com/ziutek/mymysql
I've then wrapped the database querying into simple-to-use functions. This is one such function:
import "github.com/ziutek/mymysql/mysql"
import _ "github.com/ziutek/mymysql/native"

// Execute a prepared statement expecting multiple results.
func Query(sql string, params ...interface{}) (rows []mysql.Row, err error) {
	statement, err := db.Prepare(sql)
	if err != nil {
		return
	}
	result, err := statement.Run(params...)
	if err != nil {
		return
	}
	rows, err = result.GetRows()
	return
}
Using it is as simple as this snippet:
rows, err := Query("SELECT * FROM table WHERE column = ?", param)
for _, row := range rows {
	column1 = row.Str(0)
	column2 = row.Int(1)
	column3 = row.Bool(2)
	column4 = row.Date(3)
	// etc...
}
Notice the nice row methods for coercing to a particular value. Nulls are handled by the library and the rules are documented here:
https://github.com/ziutek/mymysql/blob/master/mysql/row.go