ValueError: Face could not be detected in DeepFace - face-detection

I am implementing a code segment to detect video frames with faces and store them in an array. For this purpose I am using the DeepFace library (see the deepface GitHub repository).
Below is my code segment:
# Import libraries
from deepface import DeepFace
import matplotlib.pyplot as plt
import cv2

# Path of the video
video_file_path = '/content/drive/My Drive/Colab Notebooks/FYP Project/Data Preprocessing/youtube_clip_001.mp4'

# Reading the video
vidcap = cv2.VideoCapture(video_file_path)

# Extracting the frames
frames = []
while True:
    ret, frame = vidcap.read()
    if not ret:
        break
    # Extracting the face from the frame
    faces = DeepFace.detectFace(frame)
    if len(faces) > 0:
        frames.append(frame)
Not every frame in the video file I am using contains a human face, which is why I need to extract only the frames with faces. However, the code throws the following error:
ValueError: Face could not be detected. Please confirm that the
picture is a face photo or consider to set enforce_detection param to
False.
But when I set faces = DeepFace.detectFace(frame, enforce_detection=False) as the error suggests, it adds every frame of the video to the array, including frames without faces.
Can somebody please help me to solve this issue?
Here is the link to the video file I am using: https://drive.google.com/file/d/1vAJyjbQYAYFJS4DVN0UWDYb21wf0r0TL/view?usp=sharing

DeepFace's face detection checks for a face in the frame and throws an exception if none is found. As mentioned in this GitHub issue, setting enforce_detection=False reduces accuracy. Therefore the best option available is to handle the exception without breaking the loop.
Therefore you can modify the above code snippet as follows:
while True:
    ret, frame = vidcap.read()
    if not ret:
        break
    # Extracting the face from the frame
    try:
        faces = DeepFace.detectFace(frame)
        if len(faces) > 0:
            frames.append(frame)
    except ValueError:
        # No face detected in this frame; skip it and move on
        pass
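The try/except pattern generalizes: separating the detector call from the frame loop makes the filtering logic easy to test with any detector that raises on a faceless frame. This is a sketch with a stand-in detector; in the real pipeline the callable would wrap DeepFace.detectFace:

```python
def collect_face_frames(frames_iter, detect):
    """Keep only the frames for which detect(frame) succeeds.

    detect is expected to raise ValueError when no face is found,
    mirroring DeepFace.detectFace with enforce_detection left at True.
    """
    kept = []
    for frame in frames_iter:
        try:
            detect(frame)
        except ValueError:
            continue            # no face in this frame; skip it
        kept.append(frame)
    return kept

# Stand-in detector: pretends even-numbered "frames" contain a face.
def fake_detect(frame):
    if frame % 2 != 0:
        raise ValueError("Face could not be detected.")
    return frame

faces_only = collect_face_frames(range(6), fake_detect)   # [0, 2, 4]
```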

AssertionError: Audio source must be entered before listening, are you using ``source`` outside of a ``with`` statement?

I am getting this error when capturing a voice instruction:
File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\speech_recognition\__init__.py", line 651, in listen
assert source.stream is not None, "Audio source must be entered before listening, see documentation for ``AudioSource``; are you using ``source`` outside of a ``with`` statement?"
AssertionError: Audio source must be entered before listening, see documentation for ``AudioSource``; are you using ``source`` outside of a ``with`` statement?
Here's the code:
import speech_recognition as mic
import pyttsx3
import pywhatkit
import datetime
import wikipedia

assistant = pyttsx3.init()
listener = mic.Recognizer()

def respond(text):
    assistant.say(text)
    assistant.runAndWait()

def inputInstructions():
    with mic.Microphone() as source:
        listener.pause_threshold = 1
        print(source)
        speech = listener.listen(source)
        instruction = listener.recognize_google(speech)
        instruction = instruction.lower()
        if ("stark") in instruction:
            instruction = instruction.replace('stark', "")
        print(instruction)
    return instruction
I tried the above code, but it is not working.
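The assertion fires because listen() requires the microphone to have been entered as a context manager: its audio stream is only opened inside the with block. A minimal stdlib analog of the same contract (a toy class, not the speech_recognition API) shows why calling listen on a source outside of with fails:

```python
class AudioSource:
    """Toy stand-in for speech_recognition's AudioSource contract."""
    def __init__(self):
        self.stream = None          # the stream is only opened by __enter__

    def __enter__(self):
        self.stream = object()      # pretend to open the audio stream
        return self

    def __exit__(self, *exc):
        self.stream = None          # the stream is closed again on exit

def listen(source):
    # Mirrors the assert inside speech_recognition's Recognizer.listen
    assert source.stream is not None, "Audio source must be entered before listening"
    return "audio"

src = AudioSource()
with src as source:
    ok = listen(source)             # works: the stream is open inside the with block

try:
    listen(src)                     # fails: we are outside the with block again
    failed = False
except AssertionError:
    failed = True
```

So if this error appears even though the code contains a with statement, the listen() call has ended up outside of it, typically through an indentation mistake.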

Not able to read Sagemaker Semantic Segmentation Model Batch Transformation Output file

I have deployed a semantic segmentation model behind an endpoint, which I can invoke to get inferences for one image at a time.
Now I want to process a batch of images using a Batch Transform job. The job itself runs fine, but each output it creates is an .out file, and I cannot open those files with any visualization library such as matplotlib's imread, PIL's Image, or OpenCV's imread. They all report that it is not an image.
I just want to understand what the .out file is, and if it is a segmentation mask image (the typical output of a semantic segmentation model), how I can read it.
My code for Batch Transformation:
from sagemaker.predictor import RealTimePredictor, csv_serializer, csv_deserializer

class Predictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session=None):
        super(Predictor, self).__init__(
            endpoint_name, sagemaker_session, csv_serializer, csv_deserializer
        )

ss_model = sagemaker.model.Model(role=role, image=training_image, model_data=model,
                                 predictor_cls=Predictor, sagemaker_session=sess)
transformer = ss_model.transformer(instance_count=1, instance_type='ml.c4.xlarge',
                                   output_path=batch_output_data)
transformer.transform(data=batch_input_data, data_type='S3Prefix',
                      content_type='image/png', split_type='None')
transformer.wait()
As the official docs say:
The SageMaker semantic segmentation algorithm provides a fine-grained,
pixel-level approach to developing computer vision applications.
It tags every pixel in an image with a class label from a predefined set of classes.
...
Because the semantic segmentation algorithm classifies every pixel in
an image, it also provides information about the shapes of the objects
contained in the image. The segmentation output is represented as a
grayscale image, called a segmentation mask. A segmentation mask is a
grayscale image with the same shape as the input image.
So the .out file contains a pixel array (the assigned class for each pixel). You need to deserialize the file:
from PIL import Image
import numpy as np
import io
from boto3.session import Session

session = Session(
    aws_access_key_id=KEY_ID, aws_secret_access_key=ACCESS_KEY, region_name=REGION_NAME
)
s3 = session.resource("s3")

out_file = io.BytesIO()
s3_object = s3.Object(BUCKET, PATH)
s3_object.download_fileobj(out_file)
out_file.seek(0)
mask = np.array(Image.open(out_file))
Also, I found a class ImageDeserializer in this doc that can do the job with a stream of data. Maybe you can adapt it to your file, because it reads a stream of bytes returned from an inference endpoint (batch transform writes the same data into the .out file).
What is the .out file?
The .out file is essentially the stream (content) of your HTTP response HttpEntity (including header and body). It matches the order of the records in your input file.
To be more specific: for each batch transform, SageMaker splits your input file into records, constructs multiple HTTP requests, and sends them to the model server. The inference result is included in the HTTP response, and SageMaker uploads the inference results to S3 with a ".out" suffix.
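To illustrate the deserialization step on its own: a segmentation mask is just a single-channel image, so once the .out bytes are in a buffer, Image.open plus np.array recovers the per-pixel class labels. This is a self-contained sketch using a synthetic mask in place of a real S3 download:

```python
import io

import numpy as np
from PIL import Image

# Build a tiny synthetic 4x4 mask (class labels 0-2) and encode it as a PNG,
# standing in for the bytes a Batch Transform job writes to the .out file.
labels = np.array([[0, 0, 1, 1],
                   [0, 2, 2, 1],
                   [0, 2, 2, 1],
                   [0, 0, 1, 1]], dtype=np.uint8)
buf = io.BytesIO()
Image.fromarray(labels, mode="L").save(buf, format="PNG")

# Deserialize exactly as with the downloaded .out content.
buf.seek(0)
mask = np.array(Image.open(buf))
```

Each value in mask is a class index, so a palette or matplotlib colormap is needed to view it as anything other than a near-black grayscale image.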

Why can't I invoke sagemaker endpoint with either bytes or file as payload

I have deployed a linear regression model on SageMaker. Now I want to write a Lambda function to make predictions on input data. Files are pulled from S3 first, some preprocessing is done, and the final input is a pandas dataframe. According to the boto3 SageMaker documentation, the payload can be either bytes-like or a file, so I tried converting the dataframe to a byte array using code from this post:
# Convert pandas dataframe to byte array
pred_np = pred_df.to_records(index=False)
pred_str = pred_np.tostring()

# Start sagemaker prediction
sm_runtime = aws_session.client('runtime.sagemaker')
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=pred_str,
    ContentType='text/csv',
    Accept='Accept')
I printed out pred_str, which does look like a byte string to me.
However, when I run it, I get the following Algorithm Error caused by a UnicodeDecodeError:
Caused by: 'utf8' codec can't decode byte 0xed in position 9: invalid continuation byte
The traceback shows Python 2.7; I am not sure why that is:
Traceback (most recent call last):
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/serve.py", line 465, in invocations
    data_iter = get_data_iterator(payload, **content_parameters)
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/io/serve_helpers.py", line 99, in iterator_csv_dense_rank_2
    payload = payload.decode("utf8")
  File "/opt/amazon/python2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
Is the default decoder utf_8? What is the right decoder I should be using? Why is it complaining about position 9?
In addition, I also tried saving the dataframe to a csv file and using that as the payload:
pred_df.to_csv('pred.csv', index=False)
with open('pred.csv', 'rb') as f:
    payload = f.read()

response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=payload,
    ContentType='text/csv',
    Accept='Accept')
However, when I ran it, I got the following error:
Customer Error: Unable to parse payload. Some rows may have more columns than others and/or non-numeric values may be present in the csv data.
And again, the traceback shows Python 2.7:
Traceback (most recent call last):
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/serve.py", line 465, in invocations
    data_iter = get_data_iterator(payload, **content_parameters)
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/io/serve_helpers.py", line 123, in iterator_csv_dense_rank_2
This makes no sense, because it is a standard 6x78 dataframe: all rows have the same number of columns, and none of the values are non-numeric.
How can I fix this SageMaker issue?
I was finally able to make it work with the following code:
import io

payload = io.StringIO()
pred_df.to_csv(payload, header=None, index=None)

sm_runtime = aws_session.client('runtime.sagemaker')
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=payload.getvalue(),
    ContentType='text/csv',
    Accept='Accept')
It is very important to call the getvalue() function on the payload when invoking the endpoint. Hope this helps.
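The earlier attempt failed because to_records().tostring() serializes the raw binary memory of the record array, not CSV text, so the server's UTF-8 decode chokes on arbitrary bytes. A small stdlib sketch (using the csv module in place of pandas) shows what a proper text/csv payload looks like and why it survives the decode:

```python
import csv
import io

rows = [[1.5, 2.0, 3.25], [4.0, 5.5, 6.0]]

# What the endpoint expects for ContentType='text/csv': plain CSV text.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerows(rows)
payload = buf.getvalue()    # "1.5,2.0,3.25\r\n4.0,5.5,6.0\r\n"

# The payload round-trips through UTF-8, unlike raw array memory.
decoded = payload.encode("utf-8").decode("utf-8")
```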

Decoding and decompressing AI9_DataStream within .eps files

Context: I am attempting to automate the inspection of eps files to detect a list of attributes, such as whether the file contains locked layers, embedded bitmap images etc.
So far we have found that some of these things can be detected by inspecting the raw eps file data and its accompanying metadata (similar to the information returned by imagemagick). However, in files created by Illustrator 9 and above, the vast majority of this information is encoded within the "AI9_DataStream" portion of the file. This data is ascii85-encoded and compressed. We have had some success getting at it by using https://github.com/huandu/node-ascii85 to decode and Node's zlib library to decompress/unzip (our project is written in Node/JavaScript). However, in roughly half of our test files the unzipping step fails, throwing Z_DATA_ERROR / "incorrect data check".
Our method responsible for trying to decode:
export const decode = eps =>
  new Promise((resolve, reject) => {
    const lineDelimiters = /\r\n%|\r%|\n%/g;
    const internal = eps.match(
      /(%AI9_DataStream)([\s\S]*?)(AI9_PrivateDataEnd)/
    );
    const hasDataStream = internal && internal.length >= 2;
    if (!hasDataStream) resolve('');
    const encoded = internal[2].replace(lineDelimiters, '');
    const decoded = ascii85.decode(encoded);
    try {
      zlib.unzip(decoded, (err, buffer) => {
        // files can crash this process, for now we need to allow it
        if (err) resolve('');
        else resolve(buffer.toString('utf8'));
      });
    } catch (err) {
      reject(err);
    }
  });
I am wondering if anyone out there has experience with this issue and has insight into what might be causing it, or whether there is an alternative avenue for reliably decoding this data. Information on this topic seems sparse, so anything that could point us in the right direction would be much appreciated.
Note: The buffers produced by the ascii85 decoding all have the same 78 9c header, which should indicate standard zlib compression (and it does in fact decompress into parsable data, without error, about half the time).
Apparently we were misreading something about the ascii85 encoding. There is a ~> delimiter at the end of the encoded block that needs to be omitted from the string before decoding and subsequent unzipping.
So instead of:
/(%AI9_DataStream)([\s\S]*?)(AI9_PrivateDataEnd)/
Use:
/(%AI9_DataStream)([\s\S]*?)(~>)/
And you can get to the correct encoded/compressed data. So far this has produced human-readable, regexable data in all of our current test cases, so unless we are thrown another curve, that seems to be the answer.
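The fix can be sanity-checked outside Node: Python's standard library handles both Adobe-framed ASCII85 and zlib, and a round trip shows that the ~> terminator is a frame delimiter that must be stripped before decoding, not part of the encoded payload. This is a minimal sketch with a synthetic payload, not Illustrator-specific:

```python
import base64
import zlib

original = b"%AI9 private data stands in here"

# What the file effectively stores: zlib-compressed data, ASCII85-encoded,
# framed with Adobe's '<~' start and '~>' end-of-data markers.
stream = base64.a85encode(zlib.compress(original), adobe=True)

# Decoding with adobe=True strips the <~ / ~> framing before decoding,
# which is exactly the step the failing Node pipeline was missing.
recovered = zlib.decompress(base64.a85decode(stream, adobe=True))
```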
The only reliable method for getting content from PostScript is to run it through a PostScript interpreter, because PostScript is a programming language.
If you stick to a specific workflow with well understood input, then you may have some success in simple parsing, but that's about the only likely scenario which will work.
Note that EPS files don't have 'layers' and certainly don't have 'locked' layers.
You haven't actually pointed to a working example, but I suspect the content of the AI9_DataStream is not relevant to the EPS. It's probably a means for Illustrator to include its own native file format inside the EPS file without affecting a PostScript interpreter. This is how it works with AI-produced PDF files.
This means that when you reopen the EPS file with Adobe Illustrator, it ignores the EPS and uses the embedded native file, which magically grants you the ability to edit the file, including features like layers which cannot be represented in the EPS.

PHPExcel Load error - Cell coordinate must be a range of cells

Good Afternoon All,
I am working on an issue in PHPExcel, using the following code:
try {
    $inputFileType = PHPExcel_IOFactory::identify($fileLocation);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($fileLocation);
} catch (Exception $e) {
    die('ERROR LOADING FILE: "'.print_r(pathinfo($fileLocation), true).'": '.$e->getMessage());
} # end try catch
This responds with the following error message:
ERROR LOADING FILE: "Array ( [dirname] => upload [basename] => d10f8...188 [filename] => d10f8....188 ) ": Cell coordinate must be a range of cells.
This makes no sense, since I am not reading the file yet, only loading it. The code has been in place and working without issue for months (probably 100+ uses); only one file causes this error. The file is an Office 2007 XLSX (just like all the others). I have converted it to several other formats (xls, xlt, xlsm), but none of the copies will load either. I have found nothing in the file that could explain this behavior.
I have not found anything in my logs and am at a loss to understand the error message of 'Cell coordinate must be a range of cells'. I have isolated the code and made sure that this error message is being generated during this try/catch and is not coming from somewhere else.
Any help would be greatly appreciated,
Paul
This error was caused by a print area being defined in one of the sheets. I removed all print areas using these instructions (https://support.office.com/en-us/article/Change-or-clear-a-print-area-on-a-worksheet-deed3c1f-d2ca-4b78-b28d-9c17f0b5de34#bmclearprintarea), reran the upload, and everything worked. Thanks to MarkBaker for his assistance.
Paul
