* Rename landmark 5 variables

* Mark as NEXT

* Render tabs for multiple ui layout usage

* Allow many face detectors at once, Add face detector tweaks

* Remove face detector tweaks for now (kinda placebo)

* Fix lint issues

* Allow rendering the landmark-5 and landmark-5/68 via debugger

* Fix naming

* Convert face landmark based on confidence score

* Convert face landmark based on confidence score

* Add scrfd face detector model (#397)

* Add scrfd face detector model

* Switch to scrfd_2.5g.onnx model

* Just some renaming

* Downgrade OpenCV, Add SYSTEM_VERSION_COMPAT=0 for macOS

* Improve naming

* prepare detect frame outside of semaphore

* Feat/process manager (#399)

* Minor naming

* Introduce process manager to start and stop

* Introduce process manager to start and stop

* Introduce process manager to start and stop

* Introduce process manager to start and stop

* Introduce process manager to start and stop

* Remove useless test for now

* Avoid useless variables

* Show stop once is_processing is True

* Allow to stop ffmpeg processing too

* Implement output image resolution (#403)

* Implement output image resolution

* Reorder code

* Simplify output logic and therefore fix bug

* Frame-enhancer-onnx (#404)

* changes

* changes

* changes

* changes

* add models

* update workflow

* Some cleanup

* Some cleanup

* Feat/frame enhancer polishing (#410)

* Some cleanup

* Polish the frame enhancer

* Frame Enhancer: Add more models, optimize processing

* Minor changes

* Improve readability of create_tile_frames and merge_tile_frames

* We don't have enough models yet

* Feat/face landmarker score (#413)

* Introduce face landmarker score

* Fix testing

* Fix testing

* Use release for score related sliders

* Reduce face landmark fallbacks

* Scores and landmarks in Face dict, Change color-theme in face debugger

* Scores and landmarks in Face dict, Change color-theme in face debugger

* Fix some naming

* Add 8K support (for whatever reasons)

* Fix testing

* Using get() for face.landmarks

* Introduce statistics

* More statistics

* Limit the histogram equalization

* Enable queue() for default layout

* Improve copy_image()

* Fix error when switching detector model

* Always set UI values with globals if possible

* Use different logic for output image and output video resolutions

* Enforce re-download if file size is off

* Remove unused method

* Remove unused method

* Remove unused warning filter

* Improved output path normalization (#419)

* Handle some exceptions

* Handle some exceptions

* Cleanup

* Prevent countless thread locks

* Listen to user feedback

* Fix webp edge case

* Feat/cuda device detection (#424)

* Introduce cuda device detection

* Introduce cuda device detection

* it's gtx

* Move logic to run_nvidia_smi()

* Finalize execution device naming

* Finalize execution device naming

* Merge execution_helper.py to execution.py

* Undo lowercase of values

* Undo lowercase of values

* Finalize naming

* Add missing entry to ini

* fix lip_syncer preview (#426)

* fix lip_syncer preview

* change

* Refresh preview on trim changes

* Cleanup frame enhancers and remove useless scale in merge_video() (#428)

* Keep lips over the whole video once lip syncer is enabled (#430)

* Keep lips over the whole video once lip syncer is enabled

* changes

* changes

* Fix spacing

* Use empty audio frame on silence

* Use empty audio frame on silence

* Fix ConfigParser encoding (#431)

facefusion.ini is UTF-8 encoded, but config.py does not specify an encoding when reading it, which corrupts entries that contain non-English characters.

Affected entries:
source_paths
target_path
output_path
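
A minimal sketch of the fix, assuming the file is loaded through Python's ConfigParser (the standalone load_config helper below is illustrative, not an actual facefusion function):

    from configparser import ConfigParser

    def load_config(config_path : str = 'facefusion.ini') -> ConfigParser:
        # Passing encoding = 'utf-8' keeps non-English characters in
        # source_paths, target_path and output_path from being corrupted;
        # without it, ConfigParser falls back to the locale's default encoding.
        config = ConfigParser()
        config.read(config_path, encoding = 'utf-8')
        return config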

* Adjust spacing

* Improve the GTX 16 series detection

* Use general exception to catch ParseError

* Use general exception to catch ParseError

* Host frame enhancer models

* Use latest onnxruntime

* Minor changes in benchmark UI

* Different approach to cancel ffmpeg process

* Add support for amd amf encoders (#433)

* Add amd_amf encoders

* remove -rc cqp from amf encoder parameters

* Improve terminal output, move success messages to debug mode

* Improve terminal output, move success messages to debug mode

* Minor update

* Minor update

* onnxruntime 1.17.1 matches CUDA 12.2

* Feat/improved scaling (#435)

* Prevent useless temp upscaling, Show resolution and fps in terminal output

* Remove temp frame quality

* Remove temp frame quality

* Tiny cleanup

* Default back to png for temp frames, Remove pix_fmt from frame extraction due to an mjpeg error

* Fix inswapper fallback by onnxruntime

* Fix inswapper fallback by major onnxruntime

* Fix inswapper fallback by major onnxruntime

* Add testing for vision restrict methods

* Fix left / right face mask regions, add left-ear and right-ear

* Flip right and left again

* Undo ears - does not work with box mask

* Prepare next release

* Fix spacing

* 100% quality when using jpg for temp frames

* Use span_kendata_x4 as default because of its speed

* benchmark optimal tile and pad

* Undo commented out code

* Add real_esrgan_x4_fp16 model

* Be strict when using many face detectors

---------

Co-authored-by: Harisreedhar <46858047+harisreedhar@users.noreply.github.com>
Co-authored-by: aldemoth <159712934+aldemoth@users.noreply.github.com>
Henry Ruhs, 2024-03-14 19:56:54 +01:00, committed by GitHub
commit 7609df6747 (parent dd2193cf39)
60 changed files with 1322 additions and 624 deletions

.github/preview.png (vendored binary file, not shown): updated, 1.2 MiB before and after.

View File

@@ -30,6 +30,6 @@ jobs:
       uses: actions/setup-python@v2
       with:
         python-version: '3.10'
-    - run: python install.py --torch cpu --onnxruntime default --skip-venv
+    - run: python install.py --onnxruntime default --skip-venv
     - run: pip install pytest
     - run: pytest

View File

@ -54,12 +54,13 @@ face analyser:
--face-analyser-order {left-right,right-left,top-bottom,bottom-top,small-large,large-small,best-worst,worst-best} specify the order in which the face analyser detects faces. --face-analyser-order {left-right,right-left,top-bottom,bottom-top,small-large,large-small,best-worst,worst-best} specify the order in which the face analyser detects faces.
--face-analyser-age {child,teen,adult,senior} filter the detected faces based on their age --face-analyser-age {child,teen,adult,senior} filter the detected faces based on their age
--face-analyser-gender {female,male} filter the detected faces based on their gender --face-analyser-gender {female,male} filter the detected faces based on their gender
--face-detector-model {retinaface,yoloface,yunet} choose the model responsible for detecting the face --face-detector-model {many,retinaface,scrfd,yoloface,yunet} choose the model responsible for detecting the face
--face-detector-size FACE_DETECTOR_SIZE specify the size of the frame provided to the face detector --face-detector-size FACE_DETECTOR_SIZE specify the size of the frame provided to the face detector
--face-detector-score [0.0-1.0] filter the detected faces base on the confidence score --face-detector-score [0.0-1.0] filter the detected faces base on the confidence score
--face-landmarker-score [0.0-1.0] filter the detected landmarks base on the confidence score
face selector: face selector:
--face-selector-mode {reference,one,many} use reference based tracking with simple matching --face-selector-mode {many,one,reference} use reference based tracking or simple matching
--reference-face-position REFERENCE_FACE_POSITION specify the position used to create the reference face --reference-face-position REFERENCE_FACE_POSITION specify the position used to create the reference face
--reference-face-distance [0.0-1.5] specify the desired similarity between the reference face and target face --reference-face-distance [0.0-1.5] specify the desired similarity between the reference face and target face
--reference-frame-number REFERENCE_FRAME_NUMBER specify the frame used to create the reference face --reference-frame-number REFERENCE_FRAME_NUMBER specify the frame used to create the reference face
@ -74,12 +75,12 @@ frame extraction:
--trim-frame-start TRIM_FRAME_START specify the the start frame of the target video --trim-frame-start TRIM_FRAME_START specify the the start frame of the target video
--trim-frame-end TRIM_FRAME_END specify the the end frame of the target video --trim-frame-end TRIM_FRAME_END specify the the end frame of the target video
--temp-frame-format {bmp,jpg,png} specify the temporary resources format --temp-frame-format {bmp,jpg,png} specify the temporary resources format
--temp-frame-quality [0-100] specify the temporary resources quality
--keep-temp keep the temporary resources after processing --keep-temp keep the temporary resources after processing
output creation: output creation:
--output-image-quality [0-100] specify the image quality which translates to the compression factor --output-image-quality [0-100] specify the image quality which translates to the compression factor
--output-video-encoder {libx264,libx265,libvpx-vp9,h264_nvenc,hevc_nvenc} specify the encoder use for the video compression --output-image-resolution OUTPUT_IMAGE_RESOLUTION specify the image output resolution based on the target image
--output-video-encoder {libx264,libx265,libvpx-vp9,h264_nvenc,hevc_nvenc,h264_amf,hevc_amf} specify the encoder use for the video compression
--output-video-preset {ultrafast,superfast,veryfast,faster,fast,medium,slow,slower,veryslow} balance fast video processing and video file size --output-video-preset {ultrafast,superfast,veryfast,faster,fast,medium,slow,slower,veryslow} balance fast video processing and video file size
--output-video-quality [0-100] specify the video quality which translates to the compression factor --output-video-quality [0-100] specify the video quality which translates to the compression factor
--output-video-resolution OUTPUT_VIDEO_RESOLUTION specify the video output resolution based on the target video --output-video-resolution OUTPUT_VIDEO_RESOLUTION specify the video output resolution based on the target video
@ -88,11 +89,11 @@ output creation:
frame processors: frame processors:
--frame-processors FRAME_PROCESSORS [FRAME_PROCESSORS ...] load a single or multiple frame processors. (choices: face_debugger, face_enhancer, face_swapper, frame_enhancer, lip_syncer, ...) --frame-processors FRAME_PROCESSORS [FRAME_PROCESSORS ...] load a single or multiple frame processors. (choices: face_debugger, face_enhancer, face_swapper, frame_enhancer, lip_syncer, ...)
--face-debugger-items FACE_DEBUGGER_ITEMS [FACE_DEBUGGER_ITEMS ...] load a single or multiple frame processors (choices: bounding-box, landmark-5, landmark-68, face-mask, score, age, gender) --face-debugger-items FACE_DEBUGGER_ITEMS [FACE_DEBUGGER_ITEMS ...] load a single or multiple frame processors (choices: bounding-box, face-landmark-5, face-landmark-5/68, face-landmark-68, face-mask, face-detector-score, face-landmarker-score, age, gender)
--face-enhancer-model {codeformer,gfpgan_1.2,gfpgan_1.3,gfpgan_1.4,gpen_bfr_256,gpen_bfr_512,restoreformer_plus_plus} choose the model responsible for enhancing the face --face-enhancer-model {codeformer,gfpgan_1.2,gfpgan_1.3,gfpgan_1.4,gpen_bfr_256,gpen_bfr_512,restoreformer_plus_plus} choose the model responsible for enhancing the face
--face-enhancer-blend [0-100] blend the enhanced into the previous face --face-enhancer-blend [0-100] blend the enhanced into the previous face
--face-swapper-model {blendswap_256,inswapper_128,inswapper_128_fp16,simswap_256,simswap_512_unofficial,uniface_256} choose the model responsible for swapping the face --face-swapper-model {blendswap_256,inswapper_128,inswapper_128_fp16,simswap_256,simswap_512_unofficial,uniface_256} choose the model responsible for swapping the face
--frame-enhancer-model {real_esrgan_x2plus,real_esrgan_x4plus,real_esrnet_x4plus} choose the model responsible for enhancing the frame --frame-enhancer-model {lsdir_x4,nomos8k_sc_x4,real_esrgan_x4,span_kendata_x4} choose the model responsible for enhancing the frame
--frame-enhancer-blend [0-100] blend the enhanced into the previous frame --frame-enhancer-blend [0-100] blend the enhanced into the previous frame
--lip-syncer-model {wav2lip_gan} choose the model responsible for syncing the lips --lip-syncer-model {wav2lip_gan} choose the model responsible for syncing the lips

View File

@ -24,6 +24,7 @@ face_analyser_gender =
face_detector_model = face_detector_model =
face_detector_size = face_detector_size =
face_detector_score = face_detector_score =
face_landmarker_score =
[face_selector] [face_selector]
face_selector_mode = face_selector_mode =
@ -41,11 +42,11 @@ face_mask_regions =
trim_frame_start = trim_frame_start =
trim_frame_end = trim_frame_end =
temp_frame_format = temp_frame_format =
temp_frame_quality =
keep_temp = keep_temp =
[output_creation] [output_creation]
output_image_quality = output_image_quality =
output_image_resolution =
output_video_encoder = output_video_encoder =
output_video_preset = output_video_preset =
output_video_quality = output_video_quality =

View File

@@ -11,11 +11,16 @@ from facefusion.typing import Fps, Audio, Spectrogram, AudioFrame
 def get_audio_frame(audio_path : str, fps : Fps, frame_number : int = 0) -> Optional[AudioFrame]:
     if is_audio(audio_path):
         audio_frames = read_static_audio(audio_path, fps)
-        if frame_number < len(audio_frames):
+        if frame_number in range(len(audio_frames)):
             return audio_frames[frame_number]
     return None


+def create_empty_audio_frame() -> AudioFrame:
+    audio_frame = numpy.zeros((80, 16), dtype = numpy.int16)
+    return audio_frame
+
+
 @lru_cache(maxsize = None)
 def read_static_audio(audio_path : str, fps : Fps) -> Optional[List[AudioFrame]]:
     if is_audio(audio_path):

View File

@ -9,26 +9,29 @@ face_analyser_ages : List[FaceAnalyserAge] = [ 'child', 'teen', 'adult', 'senior
face_analyser_genders : List[FaceAnalyserGender] = [ 'female', 'male' ] face_analyser_genders : List[FaceAnalyserGender] = [ 'female', 'male' ]
face_detector_set : Dict[FaceDetectorModel, List[str]] =\ face_detector_set : Dict[FaceDetectorModel, List[str]] =\
{ {
'many': [ '640x640' ],
'retinaface': [ '160x160', '320x320', '480x480', '512x512', '640x640' ], 'retinaface': [ '160x160', '320x320', '480x480', '512x512', '640x640' ],
'scrfd': [ '160x160', '320x320', '480x480', '512x512', '640x640' ],
'yoloface': [ '640x640' ], 'yoloface': [ '640x640' ],
'yunet': [ '160x160', '320x320', '480x480', '512x512', '640x640', '768x768', '960x960', '1024x1024' ] 'yunet': [ '160x160', '320x320', '480x480', '512x512', '640x640', '768x768', '960x960', '1024x1024' ]
} }
face_selector_modes : List[FaceSelectorMode] = [ 'reference', 'one', 'many' ] face_selector_modes : List[FaceSelectorMode] = [ 'many', 'one', 'reference' ]
face_mask_types : List[FaceMaskType] = [ 'box', 'occlusion', 'region' ] face_mask_types : List[FaceMaskType] = [ 'box', 'occlusion', 'region' ]
face_mask_regions : List[FaceMaskRegion] = [ 'skin', 'left-eyebrow', 'right-eyebrow', 'left-eye', 'right-eye', 'eye-glasses', 'nose', 'mouth', 'upper-lip', 'lower-lip' ] face_mask_regions : List[FaceMaskRegion] = [ 'skin', 'left-eyebrow', 'right-eyebrow', 'left-eye', 'right-eye', 'eye-glasses', 'nose', 'mouth', 'upper-lip', 'lower-lip' ]
temp_frame_formats : List[TempFrameFormat] = [ 'bmp', 'jpg', 'png' ] temp_frame_formats : List[TempFrameFormat] = [ 'bmp', 'jpg', 'png' ]
output_video_encoders : List[OutputVideoEncoder] = [ 'libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc' ] output_video_encoders : List[OutputVideoEncoder] = [ 'libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc', 'h264_amf', 'hevc_amf' ]
output_video_presets : List[OutputVideoPreset] = [ 'ultrafast', 'superfast', 'veryfast', 'faster', 'fast', 'medium', 'slow', 'slower', 'veryslow' ] output_video_presets : List[OutputVideoPreset] = [ 'ultrafast', 'superfast', 'veryfast', 'faster', 'fast', 'medium', 'slow', 'slower', 'veryslow' ]
video_template_sizes : List[int] = [ 240, 360, 480, 540, 720, 1080, 1440, 2160 ] image_template_sizes : List[float] = [ 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 3.5, 4 ]
video_template_sizes : List[int] = [ 240, 360, 480, 540, 720, 1080, 1440, 2160, 4320 ]
execution_thread_count_range : List[int] = create_int_range(1, 128, 1) execution_thread_count_range : List[int] = create_int_range(1, 128, 1)
execution_queue_count_range : List[int] = create_int_range(1, 32, 1) execution_queue_count_range : List[int] = create_int_range(1, 32, 1)
system_memory_limit_range : List[int] = create_int_range(0, 128, 1) system_memory_limit_range : List[int] = create_int_range(0, 128, 1)
face_detector_score_range : List[float] = create_float_range(0.0, 1.0, 0.05) face_detector_score_range : List[float] = create_float_range(0.0, 1.0, 0.05)
face_landmarker_score_range : List[float] = create_float_range(0.0, 1.0, 0.05)
face_mask_blur_range : List[float] = create_float_range(0.0, 1.0, 0.05) face_mask_blur_range : List[float] = create_float_range(0.0, 1.0, 0.05)
face_mask_padding_range : List[int] = create_int_range(0, 100, 1) face_mask_padding_range : List[int] = create_int_range(0, 100, 1)
reference_face_distance_range : List[float] = create_float_range(0.0, 1.5, 0.05) reference_face_distance_range : List[float] = create_float_range(0.0, 1.5, 0.05)
temp_frame_quality_range : List[int] = create_int_range(0, 100, 1)
output_image_quality_range : List[int] = create_int_range(0, 100, 1) output_image_quality_range : List[int] = create_int_range(0, 100, 1)
output_video_quality_range : List[int] = create_int_range(0, 100, 1) output_video_quality_range : List[int] = create_int_range(0, 100, 1)

View File

@@ -1,4 +1,4 @@
-from typing import List, Any
+from typing import List, Any, Tuple

 import numpy

@@ -16,3 +16,12 @@ def create_float_range(start : float, stop : float, step : float) -> List[float]:

 def get_first(__list__ : Any) -> Any:
     return next(iter(__list__), None)
+
+
+def extract_major_version(version : str) -> Tuple[int, int]:
+    versions = version.split('.')
+    if len(versions) > 1:
+        return int(versions[0]), int(versions[1])
+    if len(versions) == 1:
+        return int(versions[0]), 0
+    return 0, 0

View File

@@ -12,7 +12,7 @@ def get_config() -> ConfigParser:
     if CONFIG is None:
         config_path = resolve_relative_path('../facefusion.ini')
         CONFIG = ConfigParser()
-        CONFIG.read(config_path)
+        CONFIG.read(config_path, encoding = 'utf-8')
     return CONFIG

View File

@ -9,7 +9,7 @@ from tqdm import tqdm
import facefusion.globals import facefusion.globals
from facefusion import wording from facefusion import wording
from facefusion.typing import VisionFrame, ModelValue, Fps from facefusion.typing import VisionFrame, ModelValue, Fps
from facefusion.execution_helper import apply_execution_provider_options from facefusion.execution import apply_execution_provider_options
from facefusion.vision import get_video_frame, count_video_frame_total, read_image, detect_video_fps from facefusion.vision import get_video_frame, count_video_frame_total, read_image, detect_video_fps
from facefusion.filesystem import resolve_relative_path from facefusion.filesystem import resolve_relative_path
from facefusion.download import conditional_download from facefusion.download import conditional_download
@ -62,23 +62,23 @@ def analyse_stream(vision_frame : VisionFrame, video_fps : Fps) -> bool:
return False return False
def prepare_frame(vision_frame : VisionFrame) -> VisionFrame:
vision_frame = cv2.resize(vision_frame, (224, 224)).astype(numpy.float32)
vision_frame -= numpy.array([ 104, 117, 123 ]).astype(numpy.float32)
vision_frame = numpy.expand_dims(vision_frame, axis = 0)
return vision_frame
def analyse_frame(vision_frame : VisionFrame) -> bool: def analyse_frame(vision_frame : VisionFrame) -> bool:
content_analyser = get_content_analyser() content_analyser = get_content_analyser()
vision_frame = prepare_frame(vision_frame) vision_frame = prepare_frame(vision_frame)
probability = content_analyser.run(None, probability = content_analyser.run(None,
{ {
'input:0': vision_frame content_analyser.get_inputs()[0].name: vision_frame
})[0][0][1] })[0][0][1]
return probability > PROBABILITY_LIMIT return probability > PROBABILITY_LIMIT
def prepare_frame(vision_frame : VisionFrame) -> VisionFrame:
vision_frame = cv2.resize(vision_frame, (224, 224)).astype(numpy.float32)
vision_frame -= numpy.array([ 104, 117, 123 ]).astype(numpy.float32)
vision_frame = numpy.expand_dims(vision_frame, axis = 0)
return vision_frame
@lru_cache(maxsize = None) @lru_cache(maxsize = None)
def analyse_image(image_path : str) -> bool: def analyse_image(image_path : str) -> bool:
frame = read_image(image_path) frame = read_image(image_path)

View File

@ -4,32 +4,31 @@ os.environ['OMP_NUM_THREADS'] = '1'
import signal import signal
import sys import sys
import time
import warnings import warnings
import shutil import shutil
import numpy import numpy
import onnxruntime import onnxruntime
from time import sleep from time import sleep, time
from argparse import ArgumentParser, HelpFormatter from argparse import ArgumentParser, HelpFormatter
import facefusion.choices import facefusion.choices
import facefusion.globals import facefusion.globals
from facefusion.face_analyser import get_one_face, get_average_face from facefusion.face_analyser import get_one_face, get_average_face
from facefusion.face_store import get_reference_faces, append_reference_face from facefusion.face_store import get_reference_faces, append_reference_face
from facefusion import face_analyser, face_masker, content_analyser, config, metadata, logger, wording from facefusion import face_analyser, face_masker, content_analyser, config, process_manager, metadata, logger, wording
from facefusion.content_analyser import analyse_image, analyse_video from facefusion.content_analyser import analyse_image, analyse_video
from facefusion.processors.frame.core import get_frame_processors_modules, load_frame_processor_module from facefusion.processors.frame.core import get_frame_processors_modules, load_frame_processor_module
from facefusion.common_helper import create_metavar, get_first from facefusion.common_helper import create_metavar, get_first
from facefusion.execution_helper import encode_execution_providers, decode_execution_providers from facefusion.execution import encode_execution_providers, decode_execution_providers
from facefusion.normalizer import normalize_output_path, normalize_padding, normalize_fps from facefusion.normalizer import normalize_output_path, normalize_padding, normalize_fps
from facefusion.memory import limit_system_memory from facefusion.memory import limit_system_memory
from facefusion.statistics import conditional_log_statistics
from facefusion.filesystem import list_directory, get_temp_frame_paths, create_temp, move_temp, clear_temp, is_image, is_video, filter_audio_paths from facefusion.filesystem import list_directory, get_temp_frame_paths, create_temp, move_temp, clear_temp, is_image, is_video, filter_audio_paths
from facefusion.ffmpeg import extract_frames, compress_image, merge_video, restore_audio, replace_audio from facefusion.ffmpeg import extract_frames, merge_video, copy_image, finalize_image, restore_audio, replace_audio
from facefusion.vision import get_video_frame, read_image, read_static_images, pack_resolution, detect_video_resolution, detect_video_fps, create_video_resolutions from facefusion.vision import read_image, read_static_images, detect_image_resolution, restrict_video_fps, create_image_resolutions, get_video_frame, detect_video_resolution, detect_video_fps, restrict_video_resolution, restrict_image_resolution, create_video_resolutions, pack_resolution, unpack_resolution
onnxruntime.set_default_logger_severity(3) onnxruntime.set_default_logger_severity(3)
warnings.filterwarnings('ignore', category = UserWarning, module = 'gradio') warnings.filterwarnings('ignore', category = UserWarning, module = 'gradio')
warnings.filterwarnings('ignore', category = UserWarning, module = 'torchvision')
def cli() -> None: def cli() -> None:
@ -63,6 +62,7 @@ def cli() -> None:
group_face_analyser.add_argument('--face-detector-model', help = wording.get('help.face_detector_model'), default = config.get_str_value('face_analyser.face_detector_model', 'yoloface'), choices = facefusion.choices.face_detector_set.keys()) group_face_analyser.add_argument('--face-detector-model', help = wording.get('help.face_detector_model'), default = config.get_str_value('face_analyser.face_detector_model', 'yoloface'), choices = facefusion.choices.face_detector_set.keys())
group_face_analyser.add_argument('--face-detector-size', help = wording.get('help.face_detector_size'), default = config.get_str_value('face_analyser.face_detector_size', '640x640')) group_face_analyser.add_argument('--face-detector-size', help = wording.get('help.face_detector_size'), default = config.get_str_value('face_analyser.face_detector_size', '640x640'))
group_face_analyser.add_argument('--face-detector-score', help = wording.get('help.face_detector_score'), type = float, default = config.get_float_value('face_analyser.face_detector_score', '0.5'), choices = facefusion.choices.face_detector_score_range, metavar = create_metavar(facefusion.choices.face_detector_score_range)) group_face_analyser.add_argument('--face-detector-score', help = wording.get('help.face_detector_score'), type = float, default = config.get_float_value('face_analyser.face_detector_score', '0.5'), choices = facefusion.choices.face_detector_score_range, metavar = create_metavar(facefusion.choices.face_detector_score_range))
group_face_analyser.add_argument('--face-landmarker-score', help = wording.get('help.face_landmarker_score'), type = float, default = config.get_float_value('face_analyser.face_landmarker_score', '0.5'), choices = facefusion.choices.face_landmarker_score_range, metavar = create_metavar(facefusion.choices.face_landmarker_score_range))
# face selector # face selector
group_face_selector = program.add_argument_group('face selector') group_face_selector = program.add_argument_group('face selector')
group_face_selector.add_argument('--face-selector-mode', help = wording.get('help.face_selector_mode'), default = config.get_str_value('face_selector.face_selector_mode', 'reference'), choices = facefusion.choices.face_selector_modes) group_face_selector.add_argument('--face-selector-mode', help = wording.get('help.face_selector_mode'), default = config.get_str_value('face_selector.face_selector_mode', 'reference'), choices = facefusion.choices.face_selector_modes)
@ -79,12 +79,12 @@ def cli() -> None:
group_frame_extraction = program.add_argument_group('frame extraction') group_frame_extraction = program.add_argument_group('frame extraction')
group_frame_extraction.add_argument('--trim-frame-start', help = wording.get('help.trim_frame_start'), type = int, default = facefusion.config.get_int_value('frame_extraction.trim_frame_start')) group_frame_extraction.add_argument('--trim-frame-start', help = wording.get('help.trim_frame_start'), type = int, default = facefusion.config.get_int_value('frame_extraction.trim_frame_start'))
group_frame_extraction.add_argument('--trim-frame-end', help = wording.get('help.trim_frame_end'), type = int, default = facefusion.config.get_int_value('frame_extraction.trim_frame_end')) group_frame_extraction.add_argument('--trim-frame-end', help = wording.get('help.trim_frame_end'), type = int, default = facefusion.config.get_int_value('frame_extraction.trim_frame_end'))
group_frame_extraction.add_argument('--temp-frame-format', help = wording.get('help.temp_frame_format'), default = config.get_str_value('frame_extraction.temp_frame_format', 'jpg'), choices = facefusion.choices.temp_frame_formats) group_frame_extraction.add_argument('--temp-frame-format', help = wording.get('help.temp_frame_format'), default = config.get_str_value('frame_extraction.temp_frame_format', 'png'), choices = facefusion.choices.temp_frame_formats)
group_frame_extraction.add_argument('--temp-frame-quality', help = wording.get('help.temp_frame_quality'), type = int, default = config.get_int_value('frame_extraction.temp_frame_quality', '100'), choices = facefusion.choices.temp_frame_quality_range, metavar = create_metavar(facefusion.choices.temp_frame_quality_range))
group_frame_extraction.add_argument('--keep-temp', help = wording.get('help.keep_temp'), action = 'store_true', default = config.get_bool_value('frame_extraction.keep_temp')) group_frame_extraction.add_argument('--keep-temp', help = wording.get('help.keep_temp'), action = 'store_true', default = config.get_bool_value('frame_extraction.keep_temp'))
# output creation # output creation
group_output_creation = program.add_argument_group('output creation') group_output_creation = program.add_argument_group('output creation')
group_output_creation.add_argument('--output-image-quality', help = wording.get('help.output_image_quality'), type = int, default = config.get_int_value('output_creation.output_image_quality', '80'), choices = facefusion.choices.output_image_quality_range, metavar = create_metavar(facefusion.choices.output_image_quality_range)) group_output_creation.add_argument('--output-image-quality', help = wording.get('help.output_image_quality'), type = int, default = config.get_int_value('output_creation.output_image_quality', '80'), choices = facefusion.choices.output_image_quality_range, metavar = create_metavar(facefusion.choices.output_image_quality_range))
group_output_creation.add_argument('--output-image-resolution', help = wording.get('help.output_image_resolution'), default = config.get_str_value('output_creation.output_image_resolution'))
group_output_creation.add_argument('--output-video-encoder', help = wording.get('help.output_video_encoder'), default = config.get_str_value('output_creation.output_video_encoder', 'libx264'), choices = facefusion.choices.output_video_encoders) group_output_creation.add_argument('--output-video-encoder', help = wording.get('help.output_video_encoder'), default = config.get_str_value('output_creation.output_video_encoder', 'libx264'), choices = facefusion.choices.output_video_encoders)
group_output_creation.add_argument('--output-video-preset', help = wording.get('help.output_video_preset'), default = config.get_str_value('output_creation.output_video_preset', 'veryfast'), choices = facefusion.choices.output_video_presets) group_output_creation.add_argument('--output-video-preset', help = wording.get('help.output_video_preset'), default = config.get_str_value('output_creation.output_video_preset', 'veryfast'), choices = facefusion.choices.output_video_presets)
group_output_creation.add_argument('--output-video-quality', help = wording.get('help.output_video_quality'), type = int, default = config.get_int_value('output_creation.output_video_quality', '80'), choices = facefusion.choices.output_video_quality_range, metavar = create_metavar(facefusion.choices.output_video_quality_range)) group_output_creation.add_argument('--output-video-quality', help = wording.get('help.output_video_quality'), type = int, default = config.get_int_value('output_creation.output_video_quality', '80'), choices = facefusion.choices.output_video_quality_range, metavar = create_metavar(facefusion.choices.output_video_quality_range))
@ -111,7 +111,7 @@ def apply_args(program : ArgumentParser) -> None:
# general # general
facefusion.globals.source_paths = args.source_paths facefusion.globals.source_paths = args.source_paths
facefusion.globals.target_path = args.target_path facefusion.globals.target_path = args.target_path
facefusion.globals.output_path = normalize_output_path(facefusion.globals.source_paths, facefusion.globals.target_path, args.output_path) facefusion.globals.output_path = args.output_path
# misc # misc
facefusion.globals.skip_download = args.skip_download facefusion.globals.skip_download = args.skip_download
facefusion.globals.headless = args.headless facefusion.globals.headless = args.headless
@ -133,6 +133,7 @@ def apply_args(program : ArgumentParser) -> None:
else: else:
facefusion.globals.face_detector_size = '640x640' facefusion.globals.face_detector_size = '640x640'
facefusion.globals.face_detector_score = args.face_detector_score facefusion.globals.face_detector_score = args.face_detector_score
facefusion.globals.face_landmarker_score = args.face_landmarker_score
# face selector # face selector
facefusion.globals.face_selector_mode = args.face_selector_mode facefusion.globals.face_selector_mode = args.face_selector_mode
facefusion.globals.reference_face_position = args.reference_face_position facefusion.globals.reference_face_position = args.reference_face_position
@ -147,20 +148,26 @@ def apply_args(program : ArgumentParser) -> None:
facefusion.globals.trim_frame_start = args.trim_frame_start facefusion.globals.trim_frame_start = args.trim_frame_start
facefusion.globals.trim_frame_end = args.trim_frame_end facefusion.globals.trim_frame_end = args.trim_frame_end
facefusion.globals.temp_frame_format = args.temp_frame_format facefusion.globals.temp_frame_format = args.temp_frame_format
facefusion.globals.temp_frame_quality = args.temp_frame_quality
facefusion.globals.keep_temp = args.keep_temp facefusion.globals.keep_temp = args.keep_temp
# output creation # output creation
facefusion.globals.output_image_quality = args.output_image_quality facefusion.globals.output_image_quality = args.output_image_quality
if is_image(args.target_path):
output_image_resolution = detect_image_resolution(args.target_path)
output_image_resolutions = create_image_resolutions(output_image_resolution)
if args.output_image_resolution in output_image_resolutions:
facefusion.globals.output_image_resolution = args.output_image_resolution
else:
facefusion.globals.output_image_resolution = pack_resolution(output_image_resolution)
facefusion.globals.output_video_encoder = args.output_video_encoder facefusion.globals.output_video_encoder = args.output_video_encoder
facefusion.globals.output_video_preset = args.output_video_preset facefusion.globals.output_video_preset = args.output_video_preset
facefusion.globals.output_video_quality = args.output_video_quality facefusion.globals.output_video_quality = args.output_video_quality
if is_video(args.target_path): if is_video(args.target_path):
target_video_resolutions = create_video_resolutions(args.target_path) output_video_resolution = detect_video_resolution(args.target_path)
if args.output_video_resolution in target_video_resolutions: output_video_resolutions = create_video_resolutions(output_video_resolution)
if args.output_video_resolution in output_video_resolutions:
facefusion.globals.output_video_resolution = args.output_video_resolution facefusion.globals.output_video_resolution = args.output_video_resolution
else: else:
target_video_resolution = detect_video_resolution(args.target_path) facefusion.globals.output_video_resolution = pack_resolution(output_video_resolution)
facefusion.globals.output_video_resolution = pack_resolution(target_video_resolution)
if args.output_video_fps or is_video(args.target_path): if args.output_video_fps or is_video(args.target_path):
facefusion.globals.output_video_fps = normalize_fps(args.output_video_fps) or detect_video_fps(args.target_path) facefusion.globals.output_video_fps = normalize_fps(args.output_video_fps) or detect_video_fps(args.target_path)
facefusion.globals.skip_audio = args.skip_audio facefusion.globals.skip_audio = args.skip_audio
@ -196,6 +203,9 @@ def run(program : ArgumentParser) -> None:
def destroy() -> None: def destroy() -> None:
process_manager.stop()
while process_manager.is_processing():
sleep(0.5)
if facefusion.globals.target_path: if facefusion.globals.target_path:
clear_temp(facefusion.globals.target_path) clear_temp(facefusion.globals.target_path)
sys.exit(0) sys.exit(0)
@ -212,7 +222,7 @@ def pre_check() -> bool:
def conditional_process() -> None: def conditional_process() -> None:
start_time = time.time() start_time = time()
for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors): for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors):
while not frame_processor_module.post_check(): while not frame_processor_module.post_check():
logger.disable() logger.disable()
@ -247,28 +257,43 @@ def conditional_append_reference_faces() -> None:
def process_image(start_time : float) -> None: def process_image(start_time : float) -> None:
normed_output_path = normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path)
if analyse_image(facefusion.globals.target_path): if analyse_image(facefusion.globals.target_path):
return return
shutil.copy2(facefusion.globals.target_path, facefusion.globals.output_path) # copy image
# process frame process_manager.start()
temp_image_resolution = pack_resolution(restrict_image_resolution(facefusion.globals.target_path, unpack_resolution(facefusion.globals.output_image_resolution)))
logger.info(wording.get('copying_image').format(resolution = temp_image_resolution), __name__.upper())
if copy_image(facefusion.globals.target_path, normed_output_path, temp_image_resolution):
logger.debug(wording.get('copying_image_succeed'), __name__.upper())
else:
logger.error(wording.get('copying_image_failed'), __name__.upper())
return
# process image
for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors): for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors):
logger.info(wording.get('processing'), frame_processor_module.NAME) logger.info(wording.get('processing'), frame_processor_module.NAME)
frame_processor_module.process_image(facefusion.globals.source_paths, facefusion.globals.output_path, facefusion.globals.output_path) frame_processor_module.process_image(facefusion.globals.source_paths, normed_output_path, normed_output_path)
frame_processor_module.post_process() frame_processor_module.post_process()
# compress image if is_process_stopping():
if compress_image(facefusion.globals.output_path): return
logger.info(wording.get('compressing_image_succeed'), __name__.upper()) # finalize image
logger.info(wording.get('finalizing_image').format(resolution = facefusion.globals.output_image_resolution), __name__.upper())
if finalize_image(normed_output_path, facefusion.globals.output_image_resolution):
logger.debug(wording.get('finalizing_image_succeed'), __name__.upper())
else: else:
logger.warn(wording.get('compressing_image_skipped'), __name__.upper()) logger.warn(wording.get('finalizing_image_skipped'), __name__.upper())
# validate image # validate image
if is_image(facefusion.globals.output_path): if is_image(normed_output_path):
seconds = '{:.2f}'.format((time.time() - start_time) % 60) seconds = '{:.2f}'.format((time() - start_time) % 60)
logger.info(wording.get('processing_image_succeed').format(seconds = seconds), __name__.upper()) logger.info(wording.get('processing_image_succeed').format(seconds = seconds), __name__.upper())
conditional_log_statistics()
else: else:
logger.error(wording.get('processing_image_failed'), __name__.upper()) logger.error(wording.get('processing_image_failed'), __name__.upper())
process_manager.end()
def process_video(start_time : float) -> None: def process_video(start_time : float) -> None:
normed_output_path = normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path)
if analyse_video(facefusion.globals.target_path, facefusion.globals.trim_frame_start, facefusion.globals.trim_frame_end): if analyse_video(facefusion.globals.target_path, facefusion.globals.trim_frame_start, facefusion.globals.trim_frame_end):
return return
# clear temp # clear temp
@ -278,47 +303,75 @@ def process_video(start_time : float) -> None:
logger.debug(wording.get('creating_temp'), __name__.upper()) logger.debug(wording.get('creating_temp'), __name__.upper())
create_temp(facefusion.globals.target_path) create_temp(facefusion.globals.target_path)
# extract frames # extract frames
logger.info(wording.get('extracting_frames_fps').format(video_fps = facefusion.globals.output_video_fps), __name__.upper()) process_manager.start()
extract_frames(facefusion.globals.target_path, facefusion.globals.output_video_resolution, facefusion.globals.output_video_fps) temp_video_resolution = pack_resolution(restrict_video_resolution(facefusion.globals.target_path, unpack_resolution(facefusion.globals.output_video_resolution)))
# process frame temp_video_fps = restrict_video_fps(facefusion.globals.target_path, facefusion.globals.output_video_fps)
logger.info(wording.get('extracting_frames').format(resolution = temp_video_resolution, fps = temp_video_fps), __name__.upper())
if extract_frames(facefusion.globals.target_path, temp_video_resolution, temp_video_fps):
logger.debug(wording.get('extracting_frames_succeed'), __name__.upper())
else:
if is_process_stopping():
return
logger.error(wording.get('extracting_frames_failed'), __name__.upper())
return
# process frames
temp_frame_paths = get_temp_frame_paths(facefusion.globals.target_path) temp_frame_paths = get_temp_frame_paths(facefusion.globals.target_path)
if temp_frame_paths: if temp_frame_paths:
for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors): for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors):
logger.info(wording.get('processing'), frame_processor_module.NAME) logger.info(wording.get('processing'), frame_processor_module.NAME)
frame_processor_module.process_video(facefusion.globals.source_paths, temp_frame_paths) frame_processor_module.process_video(facefusion.globals.source_paths, temp_frame_paths)
frame_processor_module.post_process() frame_processor_module.post_process()
if is_process_stopping():
return
else: else:
logger.error(wording.get('temp_frames_not_found'), __name__.upper()) logger.error(wording.get('temp_frames_not_found'), __name__.upper())
return return
# merge video # merge video
logger.info(wording.get('merging_video_fps').format(video_fps = facefusion.globals.output_video_fps), __name__.upper()) logger.info(wording.get('merging_video').format(resolution = facefusion.globals.output_video_resolution, fps = facefusion.globals.output_video_fps), __name__.upper())
if not merge_video(facefusion.globals.target_path, facefusion.globals.output_video_resolution, facefusion.globals.output_video_fps): if merge_video(facefusion.globals.target_path, facefusion.globals.output_video_resolution, facefusion.globals.output_video_fps):
logger.debug(wording.get('merging_video_succeed'), __name__.upper())
else:
if is_process_stopping():
return
logger.error(wording.get('merging_video_failed'), __name__.upper()) logger.error(wording.get('merging_video_failed'), __name__.upper())
return return
# handle audio # handle audio
if facefusion.globals.skip_audio: if facefusion.globals.skip_audio:
logger.info(wording.get('skipping_audio'), __name__.upper()) logger.info(wording.get('skipping_audio'), __name__.upper())
move_temp(facefusion.globals.target_path, facefusion.globals.output_path) move_temp(facefusion.globals.target_path, normed_output_path)
else: else:
if 'lip_syncer' in facefusion.globals.frame_processors: if 'lip_syncer' in facefusion.globals.frame_processors:
source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths)) source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
if source_audio_path and replace_audio(facefusion.globals.target_path, source_audio_path, facefusion.globals.output_path): if source_audio_path and replace_audio(facefusion.globals.target_path, source_audio_path, normed_output_path):
logger.info(wording.get('restoring_audio_succeed'), __name__.upper()) logger.debug(wording.get('restoring_audio_succeed'), __name__.upper())
else: else:
if is_process_stopping():
return
logger.warn(wording.get('restoring_audio_skipped'), __name__.upper()) logger.warn(wording.get('restoring_audio_skipped'), __name__.upper())
move_temp(facefusion.globals.target_path, facefusion.globals.output_path) move_temp(facefusion.globals.target_path, normed_output_path)
else: else:
if restore_audio(facefusion.globals.target_path, facefusion.globals.output_path, facefusion.globals.output_video_fps): if restore_audio(facefusion.globals.target_path, normed_output_path, facefusion.globals.output_video_fps):
logger.info(wording.get('restoring_audio_succeed'), __name__.upper()) logger.debug(wording.get('restoring_audio_succeed'), __name__.upper())
else: else:
if is_process_stopping():
return
logger.warn(wording.get('restoring_audio_skipped'), __name__.upper()) logger.warn(wording.get('restoring_audio_skipped'), __name__.upper())
move_temp(facefusion.globals.target_path, facefusion.globals.output_path) move_temp(facefusion.globals.target_path, normed_output_path)
# clear temp # clear temp
logger.debug(wording.get('clearing_temp'), __name__.upper()) logger.debug(wording.get('clearing_temp'), __name__.upper())
clear_temp(facefusion.globals.target_path) clear_temp(facefusion.globals.target_path)
# validate video # validate video
if is_video(facefusion.globals.output_path): if is_video(normed_output_path):
seconds = '{:.2f}'.format((time.time() - start_time)) seconds = '{:.2f}'.format((time() - start_time))
logger.info(wording.get('processing_video_succeed').format(seconds = seconds), __name__.upper()) logger.info(wording.get('processing_video_succeed').format(seconds = seconds), __name__.upper())
conditional_log_statistics()
else: else:
logger.error(wording.get('processing_video_failed'), __name__.upper()) logger.error(wording.get('processing_video_failed'), __name__.upper())
process_manager.end()
def is_process_stopping() -> bool:
if process_manager.is_stopping():
process_manager.end()
logger.info(wording.get('processing_stopped'), __name__.upper())
return process_manager.is_pending()

View File

@@ -32,6 +32,9 @@ def conditional_download(download_directory_path : str, urls : List[str]) -> None:
             if is_file(download_file_path):
                 current = os.path.getsize(download_file_path)
                 progress.update(current - progress.n)
+        if not is_download_done(url, download_file_path):
+            os.remove(download_file_path)
+            conditional_download(download_directory_path, [ url ])


 @lru_cache(maxsize = None)

facefusion/execution.py (new file, 97 lines)
View File

@ -0,0 +1,97 @@
from typing import List, Any
from functools import lru_cache
import subprocess
import xml.etree.ElementTree as ElementTree
import onnxruntime
from facefusion.typing import ExecutionDevice, ValueAndUnit
def encode_execution_providers(execution_providers : List[str]) -> List[str]:
return [ execution_provider.replace('ExecutionProvider', '').lower() for execution_provider in execution_providers ]
def decode_execution_providers(execution_providers: List[str]) -> List[str]:
available_execution_providers = onnxruntime.get_available_providers()
encoded_execution_providers = encode_execution_providers(available_execution_providers)
return [ execution_provider for execution_provider, encoded_execution_provider in zip(available_execution_providers, encoded_execution_providers) if any(execution_provider in encoded_execution_provider for execution_provider in execution_providers) ]
def apply_execution_provider_options(execution_providers: List[str]) -> List[Any]:
execution_providers_with_options : List[Any] = []
for execution_provider in execution_providers:
if execution_provider == 'CUDAExecutionProvider':
execution_providers_with_options.append((execution_provider,
{
'cudnn_conv_algo_search': 'EXHAUSTIVE' if use_exhaustive() else 'DEFAULT'
}))
else:
execution_providers_with_options.append(execution_provider)
return execution_providers_with_options
def use_exhaustive() -> bool:
execution_devices = detect_static_execution_devices()
product_names = ('GeForce GTX 1630', 'GeForce GTX 1650', 'GeForce GTX 1660')
return any(execution_device.get('product').get('name').startswith(product_names) for execution_device in execution_devices)
def run_nvidia_smi() -> subprocess.Popen[bytes]:
commands = [ 'nvidia-smi', '--query', '--xml-format' ]
return subprocess.Popen(commands, stdout = subprocess.PIPE)
@lru_cache(maxsize = None)
def detect_static_execution_devices() -> List[ExecutionDevice]:
return detect_execution_devices()
def detect_execution_devices() -> List[ExecutionDevice]:
execution_devices : List[ExecutionDevice] = []
try:
output, _ = run_nvidia_smi().communicate()
root_element = ElementTree.fromstring(output)
except Exception:
root_element = ElementTree.Element('xml')
for gpu_element in root_element.findall('gpu'):
execution_devices.append(
{
'driver_version': root_element.find('driver_version').text,
'framework':
{
'name': 'CUDA',
'version': root_element.find('cuda_version').text,
},
'product':
{
'vendor': 'NVIDIA',
'name': gpu_element.find('product_name').text.replace('NVIDIA ', ''),
'architecture': gpu_element.find('product_architecture').text,
},
'video_memory':
{
'total': create_value_and_unit(gpu_element.find('fb_memory_usage/total').text),
'free': create_value_and_unit(gpu_element.find('fb_memory_usage/free').text)
},
'utilization':
{
'gpu': create_value_and_unit(gpu_element.find('utilization/gpu_util').text),
'memory': create_value_and_unit(gpu_element.find('utilization/memory_util').text)
}
})
return execution_devices
def create_value_and_unit(text : str) -> ValueAndUnit:
value, unit = text.split()
value_and_unit : ValueAndUnit =\
{
'value': value,
'unit': unit
}
return value_and_unit

View File

@ -1,37 +0,0 @@
from typing import Any, List
import onnxruntime
def encode_execution_providers(execution_providers : List[str]) -> List[str]:
return [ execution_provider.replace('ExecutionProvider', '').lower() for execution_provider in execution_providers ]
def decode_execution_providers(execution_providers: List[str]) -> List[str]:
available_execution_providers = onnxruntime.get_available_providers()
encoded_execution_providers = encode_execution_providers(available_execution_providers)
return [ execution_provider for execution_provider, encoded_execution_provider in zip(available_execution_providers, encoded_execution_providers) if any(execution_provider in encoded_execution_provider for execution_provider in execution_providers) ]
def apply_execution_provider_options(execution_providers: List[str]) -> List[Any]:
execution_providers_with_options : List[Any] = []
for execution_provider in execution_providers:
if execution_provider == 'CUDAExecutionProvider':
execution_providers_with_options.append((execution_provider,
{
'cudnn_conv_algo_search': 'DEFAULT'
}))
else:
execution_providers_with_options.append(execution_provider)
return execution_providers_with_options
def map_torch_backend(execution_providers : List[str]) -> str:
if 'CoreMLExecutionProvider' in execution_providers:
return 'mps'
if 'CUDAExecutionProvider' in execution_providers or 'ROCMExecutionProvider' in execution_providers :
return 'cuda'
if 'OpenVINOExecutionProvider' in execution_providers:
return 'mkl'
return 'cpu'

View File

@ -8,10 +8,10 @@ import facefusion.globals
from facefusion.common_helper import get_first from facefusion.common_helper import get_first
from facefusion.face_helper import warp_face_by_face_landmark_5, warp_face_by_translation, create_static_anchors, distance_to_face_landmark_5, distance_to_bounding_box, convert_face_landmark_68_to_5, apply_nms, categorize_age, categorize_gender from facefusion.face_helper import warp_face_by_face_landmark_5, warp_face_by_translation, create_static_anchors, distance_to_face_landmark_5, distance_to_bounding_box, convert_face_landmark_68_to_5, apply_nms, categorize_age, categorize_gender
from facefusion.face_store import get_static_faces, set_static_faces from facefusion.face_store import get_static_faces, set_static_faces
from facefusion.execution_helper import apply_execution_provider_options from facefusion.execution import apply_execution_provider_options
from facefusion.download import conditional_download from facefusion.download import conditional_download
from facefusion.filesystem import resolve_relative_path from facefusion.filesystem import resolve_relative_path
-from facefusion.typing import VisionFrame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, BoundingBox, FaceLandmarkSet, FaceLandmark5, FaceLandmark68, Score, Embedding
+from facefusion.typing import VisionFrame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, BoundingBox, FaceLandmarkSet, FaceLandmark5, FaceLandmark68, Score, FaceScoreSet, Embedding
from facefusion.vision import resize_frame_resolution, unpack_resolution
FACE_ANALYSER = None
@@ -24,6 +24,11 @@ MODELS : ModelSet =\
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/retinaface_10g.onnx',
'path': resolve_relative_path('../.assets/models/retinaface_10g.onnx')
},
+'face_detector_scrfd':
+{
+'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/scrfd_2.5g.onnx',
+'path': resolve_relative_path('../.assets/models/scrfd_2.5g.onnx')
+},
'face_detector_yoloface':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yoloface_8n.onnx',
@@ -70,14 +75,21 @@ MODELS : ModelSet =\
def get_face_analyser() -> Any:
global FACE_ANALYSER
+face_detectors = {}
with THREAD_LOCK:
if FACE_ANALYSER is None:
-if facefusion.globals.face_detector_model == 'retinaface':
+if facefusion.globals.face_detector_model in [ 'many', 'retinaface' ]:
face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_retinaface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
-if facefusion.globals.face_detector_model == 'yoloface':
+face_detectors['retinaface'] = face_detector
+if facefusion.globals.face_detector_model in [ 'many', 'scrfd' ]:
+face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_scrfd').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
+face_detectors['scrfd'] = face_detector
+if facefusion.globals.face_detector_model in [ 'many', 'yoloface' ]:
face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_yoloface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
-if facefusion.globals.face_detector_model == 'yunet':
+face_detectors['yoloface'] = face_detector
+if facefusion.globals.face_detector_model in [ 'yunet' ]:
face_detector = cv2.FaceDetectorYN.create(MODELS.get('face_detector_yunet').get('path'), '', (0, 0))
+face_detectors['yunet'] = face_detector
if facefusion.globals.face_recognizer_model == 'arcface_blendswap':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_blendswap').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
if facefusion.globals.face_recognizer_model == 'arcface_inswapper':
@@ -90,7 +102,7 @@ def get_face_analyser() -> Any:
gender_age = onnxruntime.InferenceSession(MODELS.get('gender_age').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
FACE_ANALYSER =\
{
-'face_detector': face_detector,
+'face_detectors': face_detectors,
'face_recognizer': face_recognizer,
'face_landmarker': face_landmarker,
'gender_age': gender_age
@@ -110,6 +122,7 @@ def pre_check() -> bool:
model_urls =\
[
MODELS.get('face_detector_retinaface').get('url'),
+MODELS.get('face_detector_scrfd').get('url'),
MODELS.get('face_detector_yoloface').get('url'),
MODELS.get('face_detector_yunet').get('url'),
MODELS.get('face_recognizer_arcface_blendswap').get('url'),
@@ -124,22 +137,23 @@ def pre_check() -> bool:
def detect_with_retinaface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
-face_detector = get_face_analyser().get('face_detector')
+face_detector = get_face_analyser().get('face_detectors').get('retinaface')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
-temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
+temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
feature_strides = [ 8, 16, 32 ]
feature_map_channel = 3
anchor_total = 2
bounding_box_list = []
-face_landmark5_list = []
+face_landmark_5_list = []
score_list = []
+detect_vision_frame = prepare_detect_frame(temp_vision_frame, face_detector_size)
with THREAD_SEMAPHORE:
detections = face_detector.run(None,
{
-face_detector.get_inputs()[0].name: prepare_detect_frame(temp_vision_frame, face_detector_size)
+face_detector.get_inputs()[0].name: detect_vision_frame
})
for index, feature_stride in enumerate(feature_strides):
keep_indices = numpy.where(detections[index] >= facefusion.globals.face_detector_score)[0]
@@ -157,27 +171,70 @@ def detect_with_retinaface(vision_frame : VisionFrame, face_detector_size : str)
bounding_box[2] * ratio_width,
bounding_box[3] * ratio_height
]))
-for face_landmark5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
-face_landmark5_list.append(face_landmark5 * [ ratio_width, ratio_height ])
+for face_landmark_5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
+face_landmark_5_list.append(face_landmark_5 * [ ratio_width, ratio_height ])
for score in detections[index][keep_indices]:
score_list.append(score[0])
-return bounding_box_list, face_landmark5_list, score_list
+return bounding_box_list, face_landmark_5_list, score_list
-def detect_with_yoloface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
-face_detector = get_face_analyser().get('face_detector')
+def detect_with_scrfd(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
+face_detector = get_face_analyser().get('face_detectors').get('scrfd')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
-temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
+temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
+feature_strides = [ 8, 16, 32 ]
+feature_map_channel = 3
+anchor_total = 2
bounding_box_list = []
-face_landmark5_list = []
+face_landmark_5_list = []
score_list = []
+detect_vision_frame = prepare_detect_frame(temp_vision_frame, face_detector_size)
with THREAD_SEMAPHORE:
detections = face_detector.run(None,
{
-face_detector.get_inputs()[0].name: prepare_detect_frame(temp_vision_frame, face_detector_size)
+face_detector.get_inputs()[0].name: detect_vision_frame
+})
+for index, feature_stride in enumerate(feature_strides):
+keep_indices = numpy.where(detections[index] >= facefusion.globals.face_detector_score)[0]
+if keep_indices.any():
+stride_height = face_detector_height // feature_stride
+stride_width = face_detector_width // feature_stride
+anchors = create_static_anchors(feature_stride, anchor_total, stride_height, stride_width)
+bounding_box_raw = detections[index + feature_map_channel] * feature_stride
+face_landmark_5_raw = detections[index + feature_map_channel * 2] * feature_stride
+for bounding_box in distance_to_bounding_box(anchors, bounding_box_raw)[keep_indices]:
+bounding_box_list.append(numpy.array(
+[
+bounding_box[0] * ratio_width,
+bounding_box[1] * ratio_height,
+bounding_box[2] * ratio_width,
+bounding_box[3] * ratio_height
+]))
+for face_landmark_5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
+face_landmark_5_list.append(face_landmark_5 * [ ratio_width, ratio_height ])
+for score in detections[index][keep_indices]:
+score_list.append(score[0])
+return bounding_box_list, face_landmark_5_list, score_list
+def detect_with_yoloface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
+face_detector = get_face_analyser().get('face_detectors').get('yoloface')
+face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
+temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
+ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
+ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
+bounding_box_list = []
+face_landmark_5_list = []
+score_list = []
+detect_vision_frame = prepare_detect_frame(temp_vision_frame, face_detector_size)
+with THREAD_SEMAPHORE:
+detections = face_detector.run(None,
+{
+face_detector.get_inputs()[0].name: detect_vision_frame
})
detections = numpy.squeeze(detections).T
bounding_box_raw, score_raw, face_landmark_5_raw = numpy.split(detections, [ 4, 5 ], axis = 1)
@@ -195,26 +252,26 @@ def detect_with_yoloface(vision_frame : VisionFrame, face_detector_size : str) -
face_landmark_5_raw[:, 0::3] = (face_landmark_5_raw[:, 0::3]) * ratio_width
face_landmark_5_raw[:, 1::3] = (face_landmark_5_raw[:, 1::3]) * ratio_height
for face_landmark_5 in face_landmark_5_raw:
-face_landmark5_list.append(numpy.array(face_landmark_5.reshape(-1, 3)[:, :2]))
+face_landmark_5_list.append(numpy.array(face_landmark_5.reshape(-1, 3)[:, :2]))
score_list = score_raw.ravel().tolist()
-return bounding_box_list, face_landmark5_list, score_list
+return bounding_box_list, face_landmark_5_list, score_list
def detect_with_yunet(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
-face_detector = get_face_analyser().get('face_detector')
+face_detector = get_face_analyser().get('face_detectors').get('yunet')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
-temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
+temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
bounding_box_list = []
-face_landmark5_list = []
+face_landmark_5_list = []
score_list = []
face_detector.setInputSize((temp_vision_frame.shape[1], temp_vision_frame.shape[0]))
face_detector.setScoreThreshold(facefusion.globals.face_detector_score)
with THREAD_SEMAPHORE:
_, detections = face_detector.detect(temp_vision_frame)
-if detections.any():
+if numpy.any(detections):
for detection in detections:
bounding_box_list.append(numpy.array(
[
@@ -223,9 +280,9 @@ def detect_with_yunet(vision_frame : VisionFrame, face_detector_size : str) -> T
(detection[0] + detection[2]) * ratio_width,
(detection[1] + detection[3]) * ratio_height
]))
-face_landmark5_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height ])
+face_landmark_5_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height ])
score_list.append(detection[14])
-return bounding_box_list, face_landmark5_list, score_list
+return bounding_box_list, face_landmark_5_list, score_list
def prepare_detect_frame(temp_vision_frame : VisionFrame, face_detector_size : str) -> VisionFrame:
@@ -237,30 +294,41 @@ def prepare_detect_frame(temp_vision_frame : VisionFrame, face_detector_size : s
return detect_vision_frame
-def create_faces(vision_frame : VisionFrame, bounding_box_list : List[BoundingBox], face_landmark5_list : List[FaceLandmark5], score_list : List[Score]) -> List[Face]:
+def create_faces(vision_frame : VisionFrame, bounding_box_list : List[BoundingBox], face_landmark_5_list : List[FaceLandmark5], score_list : List[Score]) -> List[Face]:
faces = []
if facefusion.globals.face_detector_score > 0:
sort_indices = numpy.argsort(-numpy.array(score_list))
bounding_box_list = [ bounding_box_list[index] for index in sort_indices ]
-face_landmark5_list = [ face_landmark5_list[index] for index in sort_indices ]
+face_landmark_5_list = [ face_landmark_5_list[index] for index in sort_indices ]
score_list = [ score_list[index] for index in sort_indices ]
-keep_indices = apply_nms(bounding_box_list, 0.4)
+iou_threshold = 0.1 if facefusion.globals.face_detector_model == 'many' else 0.4
+keep_indices = apply_nms(bounding_box_list, iou_threshold)
for index in keep_indices:
bounding_box = bounding_box_list[index]
-face_landmark_68 = detect_face_landmark_68(vision_frame, bounding_box)
-landmark : FaceLandmarkSet =\
+face_landmark_5_68 = face_landmark_5_list[index]
+face_landmark_68 = None
+face_landmark_68_score = 0.0
+if facefusion.globals.face_landmarker_score > 0:
+face_landmark_68, face_landmark_68_score = detect_face_landmark_68(vision_frame, bounding_box)
+if face_landmark_68_score > facefusion.globals.face_landmarker_score:
+face_landmark_5_68 = convert_face_landmark_68_to_5(face_landmark_68)
+landmarks : FaceLandmarkSet =\
{
-'5': face_landmark5_list[index],
-'5/68': convert_face_landmark_68_to_5(face_landmark_68),
+'5': face_landmark_5_list[index],
+'5/68': face_landmark_5_68,
'68': face_landmark_68
}
-score = score_list[index]
-embedding, normed_embedding = calc_embedding(vision_frame, landmark['5/68'])
+scores : FaceScoreSet =\
+{
+'detector': score_list[index],
+'landmarker': face_landmark_68_score
+}
+embedding, normed_embedding = calc_embedding(vision_frame, landmarks.get('5/68'))
gender, age = detect_gender_age(vision_frame, bounding_box)
faces.append(Face(
bounding_box = bounding_box,
-landmark = landmark,
-score = score,
+landmarks = landmarks,
+scores = scores,
embedding = embedding,
normed_embedding = normed_embedding,
gender = gender,
@@ -284,21 +352,27 @@ def calc_embedding(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandma
return embedding, normed_embedding
-def detect_face_landmark_68(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> FaceLandmark68:
+def detect_face_landmark_68(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> Tuple[FaceLandmark68, Score]:
face_landmarker = get_face_analyser().get('face_landmarker')
scale = 195 / numpy.subtract(bounding_box[2:], bounding_box[:2]).max()
translation = (256 - numpy.add(bounding_box[2:], bounding_box[:2]) * scale) * 0.5
crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (256, 256))
+crop_vision_frame = cv2.cvtColor(crop_vision_frame, cv2.COLOR_RGB2Lab)
+if numpy.mean(crop_vision_frame[:, :, 0]) < 30:
+crop_vision_frame[:, :, 0] = cv2.createCLAHE(clipLimit = 2).apply(crop_vision_frame[:, :, 0])
+crop_vision_frame = cv2.cvtColor(crop_vision_frame, cv2.COLOR_Lab2RGB)
crop_vision_frame = crop_vision_frame.transpose(2, 0, 1).astype(numpy.float32) / 255.0
-face_landmark_68 = face_landmarker.run(None,
+face_landmark_68, face_heatmap = face_landmarker.run(None,
{
face_landmarker.get_inputs()[0].name: [ crop_vision_frame ]
-})[0]
+})
face_landmark_68 = face_landmark_68[:, :, :2][0] / 64
face_landmark_68 = face_landmark_68.reshape(1, -1, 2) * 256
face_landmark_68 = cv2.transform(face_landmark_68, cv2.invertAffineTransform(affine_matrix))
face_landmark_68 = face_landmark_68.reshape(-1, 2)
-return face_landmark_68
+face_landmark_68_score = numpy.amax(face_heatmap, axis = (2, 3))
+face_landmark_68_score = numpy.mean(face_landmark_68_score)
+return face_landmark_68, face_landmark_68_score
def detect_gender_age(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> Tuple[int, int]:
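
As an aside (not part of the diff): the new landmarker score is simply the heatmap peak per landmark point, averaged over all 68 points. A minimal numpy sketch with a synthetic heatmap, assuming the same (1, 68, 64, 64) layout the hunk above implies:

import numpy

# synthetic stand-in for the landmarker heatmap output, shape (1, 68, 64, 64)
face_heatmap = numpy.random.rand(1, 68, 64, 64).astype(numpy.float32)
# peak activation per landmark channel ...
per_landmark_score = numpy.amax(face_heatmap, axis = (2, 3))
# ... averaged into a single confidence value, as in the hunk above
face_landmark_68_score = float(numpy.mean(per_landmark_score))
print(face_landmark_68_score)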
@@ -344,8 +418,8 @@ def get_average_face(vision_frames : List[VisionFrame], position : int = 0) -> O
first_face = get_first(faces)
average_face = Face(
bounding_box = first_face.bounding_box,
-landmark = first_face.landmark,
-score = first_face.score,
+landmarks = first_face.landmarks,
+scores = first_face.scores,
embedding = numpy.mean(embedding_list, axis = 0),
normed_embedding = numpy.mean(normed_embedding_list, axis = 0),
gender = first_face.gender,
@@ -361,15 +435,32 @@ def get_many_faces(vision_frame : VisionFrame) -> List[Face]:
if faces_cache:
faces = faces_cache
else:
-if facefusion.globals.face_detector_model == 'retinaface':
-bounding_box_list, face_landmark5_list, score_list = detect_with_retinaface(vision_frame, facefusion.globals.face_detector_size)
-faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
-if facefusion.globals.face_detector_model == 'yoloface':
-bounding_box_list, face_landmark5_list, score_list = detect_with_yoloface(vision_frame, facefusion.globals.face_detector_size)
-faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
-if facefusion.globals.face_detector_model == 'yunet':
-bounding_box_list, face_landmark5_list, score_list = detect_with_yunet(vision_frame, facefusion.globals.face_detector_size)
-faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
+bounding_box_list = []
+face_landmark_5_list = []
+score_list = []
+if facefusion.globals.face_detector_model in [ 'many', 'retinaface' ]:
+bounding_box_list_retinaface, face_landmark_5_list_retinaface, score_list_retinaface = detect_with_retinaface(vision_frame, facefusion.globals.face_detector_size)
+bounding_box_list.extend(bounding_box_list_retinaface)
+face_landmark_5_list.extend(face_landmark_5_list_retinaface)
+score_list.extend(score_list_retinaface)
+if facefusion.globals.face_detector_model in [ 'many', 'scrfd' ]:
+bounding_box_list_scrfd, face_landmark_5_list_scrfd, score_list_scrfd = detect_with_scrfd(vision_frame, facefusion.globals.face_detector_size)
+bounding_box_list.extend(bounding_box_list_scrfd)
+face_landmark_5_list.extend(face_landmark_5_list_scrfd)
+score_list.extend(score_list_scrfd)
+if facefusion.globals.face_detector_model in [ 'many', 'yoloface' ]:
+bounding_box_list_yoloface, face_landmark_5_list_yoloface, score_list_yoloface = detect_with_yoloface(vision_frame, facefusion.globals.face_detector_size)
+bounding_box_list.extend(bounding_box_list_yoloface)
+face_landmark_5_list.extend(face_landmark_5_list_yoloface)
+score_list.extend(score_list_yoloface)
+if facefusion.globals.face_detector_model in [ 'yunet' ]:
+bounding_box_list_yunet, face_landmark_5_list_yunet, score_list_yunet = detect_with_yunet(vision_frame, facefusion.globals.face_detector_size)
+bounding_box_list.extend(bounding_box_list_yunet)
+face_landmark_5_list.extend(face_landmark_5_list_yunet)
+score_list.extend(score_list_yunet)
+if bounding_box_list and face_landmark_5_list and score_list:
+faces = create_faces(vision_frame, bounding_box_list, face_landmark_5_list, score_list)
if faces:
set_static_faces(vision_frame, faces)
if facefusion.globals.face_analyser_order:
@@ -422,9 +513,9 @@ def sort_by_order(faces : List[Face], order : FaceAnalyserOrder) -> List[Face]:
if order == 'large-small':
return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]), reverse = True)
if order == 'best-worst':
-return sorted(faces, key = lambda face: face.score, reverse = True)
+return sorted(faces, key = lambda face: face.scores.get('detector'), reverse = True)
if order == 'worst-best':
-return sorted(faces, key = lambda face: face.score)
+return sorted(faces, key = lambda face: face.scores.get('detector'))
return faces
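
Context for the iou_threshold change in create_faces(): with the 'many' detector model the same face is usually reported by several detectors at once, so non-maximum suppression has to be much stricter (0.1 instead of 0.4) to collapse the duplicates. A self-contained sketch of greedy NMS, using a hypothetical helper rather than the project's apply_nms():

from typing import List
import numpy

def iou(box_a : numpy.ndarray, box_b : numpy.ndarray) -> float:
	# boxes are [ x1, y1, x2, y2 ]
	x1 = max(box_a[0], box_b[0])
	y1 = max(box_a[1], box_b[1])
	x2 = min(box_a[2], box_b[2])
	y2 = min(box_a[3], box_b[3])
	intersection = max(0.0, x2 - x1) * max(0.0, y2 - y1)
	area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
	area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
	return intersection / (area_a + area_b - intersection)

def greedy_nms(bounding_boxes : List[numpy.ndarray], iou_threshold : float) -> List[int]:
	# assumes the boxes are already sorted by detector score, best first
	keep_indices = []
	for index, bounding_box in enumerate(bounding_boxes):
		if all(iou(bounding_box, bounding_boxes[keep]) < iou_threshold for keep in keep_indices):
			keep_indices.append(index)
	return keep_indices

bounding_boxes =\
[
	numpy.array([ 10, 10, 110, 110 ]),  # retinaface hit
	numpy.array([ 12, 11, 108, 112 ]),  # scrfd hit on the same face
	numpy.array([ 300, 50, 380, 140 ])  # a second face
]
print(greedy_nms(bounding_boxes, 0.1)) # [0, 2] -> the near-duplicate is suppressed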


@@ -1,12 +1,12 @@
-from typing import Any, Dict, Tuple, List
+from typing import Any, Tuple, List
from cv2.typing import Size
from functools import lru_cache
import cv2
import numpy
-from facefusion.typing import BoundingBox, FaceLandmark5, FaceLandmark68, VisionFrame, Mask, Matrix, Translation, Template, FaceAnalyserAge, FaceAnalyserGender
+from facefusion.typing import BoundingBox, FaceLandmark5, FaceLandmark68, VisionFrame, Mask, Matrix, Translation, WarpTemplate, WarpTemplateSet, FaceAnalyserAge, FaceAnalyserGender
-TEMPLATES : Dict[Template, numpy.ndarray[Any, Any]] =\
+WARP_TEMPLATES : WarpTemplateSet =\
{
'arcface_112_v1': numpy.array(
[
@@ -43,9 +43,9 @@ TEMPLATES : Dict[Template, numpy.ndarray[Any, Any]] =\
}
-def warp_face_by_face_landmark_5(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandmark5, template : Template, crop_size : Size) -> Tuple[VisionFrame, Matrix]:
-normed_template = TEMPLATES.get(template) * crop_size
-affine_matrix = cv2.estimateAffinePartial2D(face_landmark_5, normed_template, method = cv2.RANSAC, ransacReprojThreshold = 100)[0]
+def warp_face_by_face_landmark_5(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandmark5, warp_template : WarpTemplate, crop_size : Size) -> Tuple[VisionFrame, Matrix]:
+normed_warp_template = WARP_TEMPLATES.get(warp_template) * crop_size
+affine_matrix = cv2.estimateAffinePartial2D(face_landmark_5, normed_warp_template, method = cv2.RANSAC, ransacReprojThreshold = 100)[0]
crop_vision_frame = cv2.warpAffine(temp_vision_frame, affine_matrix, crop_size, borderMode = cv2.BORDER_REPLICATE, flags = cv2.INTER_AREA)
return crop_vision_frame, affine_matrix
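
A rough usage sketch of the renamed warp API, assuming a made-up 5-point template and landmarks (the real templates ship in WARP_TEMPLATES and a real FaceLandmark5 comes from the face analyser):

from typing import Tuple
import cv2
import numpy

# hypothetical 5-point template in normalized [0, 1] coordinates:
# left eye, right eye, nose tip, left mouth corner, right mouth corner
EXAMPLE_WARP_TEMPLATE = numpy.array(
[
	[ 0.34, 0.46 ],
	[ 0.66, 0.46 ],
	[ 0.50, 0.64 ],
	[ 0.37, 0.82 ],
	[ 0.63, 0.82 ]
])

def warp_example(vision_frame : numpy.ndarray, face_landmark_5 : numpy.ndarray, crop_size : Tuple[int, int]) -> Tuple[numpy.ndarray, numpy.ndarray]:
	# scale the normalized template to pixel space and solve a similarity transform
	normed_warp_template = (EXAMPLE_WARP_TEMPLATE * crop_size).astype(numpy.float32)
	affine_matrix = cv2.estimateAffinePartial2D(face_landmark_5.astype(numpy.float32), normed_warp_template, method = cv2.RANSAC, ransacReprojThreshold = 100)[0]
	crop_vision_frame = cv2.warpAffine(vision_frame, affine_matrix, crop_size, borderMode = cv2.BORDER_REPLICATE, flags = cv2.INTER_AREA)
	return crop_vision_frame, affine_matrix

vision_frame = numpy.zeros((720, 1280, 3), dtype = numpy.uint8)
face_landmark_5 = numpy.array([ [ 500, 300 ], [ 600, 300 ], [ 550, 360 ], [ 510, 420 ], [ 590, 420 ] ], dtype = numpy.float32)
crop_vision_frame, affine_matrix = warp_example(vision_frame, face_landmark_5, (512, 512))
print(crop_vision_frame.shape, affine_matrix.shape) # (512, 512, 3) (2, 3)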
@@ -89,7 +89,7 @@ def create_static_anchors(feature_stride : int, anchor_total : int, stride_heigh
return anchors
-def create_bounding_box_from_landmark(face_landmark_68 : FaceLandmark68) -> BoundingBox:
+def create_bounding_box_from_face_landmark_68(face_landmark_68 : FaceLandmark68) -> BoundingBox:
min_x, min_y = numpy.min(face_landmark_68, axis = 0)
max_x, max_y = numpy.max(face_landmark_68, axis = 0)
bounding_box = numpy.array([ min_x, min_y, max_x, max_y ]).astype(numpy.int16)
@@ -113,12 +113,14 @@ def distance_to_face_landmark_5(points : numpy.ndarray[Any, Any], distance : num
def convert_face_landmark_68_to_5(landmark_68 : FaceLandmark68) -> FaceLandmark5:
-left_eye = numpy.mean(landmark_68[36:42], axis = 0)
-right_eye = numpy.mean(landmark_68[42:48], axis = 0)
-nose = landmark_68[30]
-left_mouth_end = landmark_68[48]
-right_mouth_end = landmark_68[54]
-face_landmark_5 = numpy.array([ left_eye, right_eye, nose, left_mouth_end, right_mouth_end ])
+face_landmark_5 = numpy.array(
+[
+numpy.mean(landmark_68[36:42], axis = 0),
+numpy.mean(landmark_68[42:48], axis = 0),
+landmark_68[30],
+landmark_68[48],
+landmark_68[54]
+])
return face_landmark_5
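
For reference, the 68-point to 5-point mapping above picks the eye centers (means of points 36-41 and 42-47), the nose tip (30) and the mouth corners (48 and 54). A tiny synthetic example of the index mapping:

import numpy

# synthetic 68-point landmark set, just to exercise the index mapping
landmark_68 = numpy.zeros((68, 2))
landmark_68[36:42] = [ 100, 120 ] # left eye ring
landmark_68[42:48] = [ 160, 120 ] # right eye ring
landmark_68[30] = [ 130, 150 ]    # nose tip
landmark_68[48] = [ 110, 180 ]    # left mouth corner
landmark_68[54] = [ 150, 180 ]    # right mouth corner

face_landmark_5 = numpy.array(
[
	numpy.mean(landmark_68[36:42], axis = 0),
	numpy.mean(landmark_68[42:48], axis = 0),
	landmark_68[30],
	landmark_68[48],
	landmark_68[54]
])
print(face_landmark_5)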


@@ -8,7 +8,7 @@ import onnxruntime
import facefusion.globals
from facefusion.typing import FaceLandmark68, VisionFrame, Mask, Padding, FaceMaskRegion, ModelSet
-from facefusion.execution_helper import apply_execution_provider_options
+from facefusion.execution import apply_execution_provider_options
from facefusion.filesystem import resolve_relative_path
from facefusion.download import conditional_download


@@ -1,57 +1,55 @@
from typing import List, Optional
import subprocess
+import filetype
import facefusion.globals
-from facefusion import logger
+from facefusion import process_manager
from facefusion.typing import OutputVideoPreset, Fps, AudioBuffer
from facefusion.filesystem import get_temp_frames_pattern, get_temp_output_video_path
def run_ffmpeg(args : List[str]) -> bool:
-commands = [ 'ffmpeg', '-hide_banner', '-loglevel', 'error' ]
+commands = [ 'ffmpeg', '-hide_banner', '-loglevel', 'quiet' ]
commands.extend(args)
+process = subprocess.Popen(commands, stdout = subprocess.PIPE)
+while process_manager.is_processing():
try:
-subprocess.run(commands, stderr = subprocess.PIPE, check = True)
-return True
-except subprocess.CalledProcessError as exception:
-logger.debug(exception.stderr.decode().strip(), __name__.upper())
-return False
+return process.wait(timeout = 0.5) == 0
+except subprocess.TimeoutExpired:
+continue
+return process.returncode == 0
def open_ffmpeg(args : List[str]) -> subprocess.Popen[bytes]:
-commands = [ 'ffmpeg', '-hide_banner', '-loglevel', 'error' ]
+commands = [ 'ffmpeg', '-hide_banner', '-loglevel', 'quiet' ]
commands.extend(args)
-return subprocess.Popen(commands, stdin = subprocess.PIPE, stdout = subprocess.PIPE)
+return subprocess.Popen(commands, stdout = subprocess.PIPE)
-def extract_frames(target_path : str, video_resolution : str, video_fps : Fps) -> bool:
-temp_frame_compression = round(31 - (facefusion.globals.temp_frame_quality * 0.31))
+def extract_frames(target_path : str, temp_video_resolution : str, temp_video_fps : Fps) -> bool:
trim_frame_start = facefusion.globals.trim_frame_start
trim_frame_end = facefusion.globals.trim_frame_end
temp_frames_pattern = get_temp_frames_pattern(target_path, '%04d')
-commands = [ '-hwaccel', 'auto', '-i', target_path, '-q:v', str(temp_frame_compression), '-pix_fmt', 'rgb24' ]
+commands = [ '-hwaccel', 'auto', '-i', target_path, '-q:v', '0' ]
if trim_frame_start is not None and trim_frame_end is not None:
-commands.extend([ '-vf', 'trim=start_frame=' + str(trim_frame_start) + ':end_frame=' + str(trim_frame_end) + ',scale=' + str(video_resolution) + ',fps=' + str(video_fps) ])
+commands.extend([ '-vf', 'trim=start_frame=' + str(trim_frame_start) + ':end_frame=' + str(trim_frame_end) + ',scale=' + str(temp_video_resolution) + ',fps=' + str(temp_video_fps) ])
elif trim_frame_start is not None:
-commands.extend([ '-vf', 'trim=start_frame=' + str(trim_frame_start) + ',scale=' + str(video_resolution) + ',fps=' + str(video_fps) ])
+commands.extend([ '-vf', 'trim=start_frame=' + str(trim_frame_start) + ',scale=' + str(temp_video_resolution) + ',fps=' + str(temp_video_fps) ])
elif trim_frame_end is not None:
-commands.extend([ '-vf', 'trim=end_frame=' + str(trim_frame_end) + ',scale=' + str(video_resolution) + ',fps=' + str(video_fps) ])
+commands.extend([ '-vf', 'trim=end_frame=' + str(trim_frame_end) + ',scale=' + str(temp_video_resolution) + ',fps=' + str(temp_video_fps) ])
else:
-commands.extend([ '-vf', 'scale=' + str(video_resolution) + ',fps=' + str(video_fps) ])
+commands.extend([ '-vf', 'scale=' + str(temp_video_resolution) + ',fps=' + str(temp_video_fps) ])
commands.extend([ '-vsync', '0', temp_frames_pattern ])
return run_ffmpeg(commands)
-def compress_image(output_path : str) -> bool:
-output_image_compression = round(31 - (facefusion.globals.output_image_quality * 0.31))
-commands = [ '-hwaccel', 'auto', '-i', output_path, '-q:v', str(output_image_compression), '-y', output_path ]
-return run_ffmpeg(commands)
-def merge_video(target_path : str, video_resolution : str, video_fps : Fps) -> bool:
+def merge_video(target_path : str, output_video_resolution : str, output_video_fps : Fps) -> bool:
temp_output_video_path = get_temp_output_video_path(target_path)
temp_frames_pattern = get_temp_frames_pattern(target_path, '%04d')
-commands = [ '-hwaccel', 'auto', '-s', str(video_resolution), '-r', str(video_fps), '-i', temp_frames_pattern, '-c:v', facefusion.globals.output_video_encoder ]
+commands = [ '-hwaccel', 'auto', '-s', str(output_video_resolution), '-r', str(output_video_fps), '-i', temp_frames_pattern, '-c:v', facefusion.globals.output_video_encoder ]
if facefusion.globals.output_video_encoder in [ 'libx264', 'libx265' ]:
output_video_compression = round(51 - (facefusion.globals.output_video_quality * 0.51))
commands.extend([ '-crf', str(output_video_compression), '-preset', facefusion.globals.output_video_preset ])
@@ -61,29 +59,46 @@ def merge_video(target_path : str, video_resolution : str, video_fps : Fps) -> b
if facefusion.globals.output_video_encoder in [ 'h264_nvenc', 'hevc_nvenc' ]:
output_video_compression = round(51 - (facefusion.globals.output_video_quality * 0.51))
commands.extend([ '-cq', str(output_video_compression), '-preset', map_nvenc_preset(facefusion.globals.output_video_preset) ])
+if facefusion.globals.output_video_encoder in [ 'h264_amf', 'hevc_amf' ]:
+output_video_compression = round(51 - (facefusion.globals.output_video_quality * 0.51))
+commands.extend([ '-qp_i', str(output_video_compression), '-qp_p', str(output_video_compression), '-quality', map_amf_preset(facefusion.globals.output_video_preset) ])
commands.extend([ '-pix_fmt', 'yuv420p', '-colorspace', 'bt709', '-y', temp_output_video_path ])
return run_ffmpeg(commands)
-def read_audio_buffer(target_path : str, sample_rate : int, channel_total : int) -> Optional[AudioBuffer]:
-commands = [ '-i', target_path, '-vn', '-f', 's16le', '-acodec', 'pcm_s16le', '-ar', str(sample_rate), '-ac', str(channel_total), '-' ]
+def copy_image(target_path : str, output_path : str, temp_image_resolution : str) -> bool:
+is_webp = filetype.guess_mime(target_path) == 'image/webp'
+temp_image_compression = 100 if is_webp else 0
+commands = [ '-i', target_path, '-q:v', str(temp_image_compression), '-vf', 'scale=' + str(temp_image_resolution), '-y', output_path ]
+return run_ffmpeg(commands)
+def finalize_image(output_path : str, output_image_resolution : str) -> bool:
+output_image_compression = round(31 - (facefusion.globals.output_image_quality * 0.31))
+commands = [ '-i', output_path, '-q:v', str(output_image_compression), '-vf', 'scale=' + str(output_image_resolution), '-y', output_path ]
+return run_ffmpeg(commands)
+def read_audio_buffer(target_path : str, sample_rate : int, total_channel : int) -> Optional[AudioBuffer]:
+commands = [ '-i', target_path, '-vn', '-f', 's16le', '-acodec', 'pcm_s16le', '-ar', str(sample_rate), '-ac', str(total_channel), '-' ]
process = open_ffmpeg(commands)
-audio_buffer, error = process.communicate()
+audio_buffer, _ = process.communicate()
if process.returncode == 0:
return audio_buffer
return None
-def restore_audio(target_path : str, output_path : str, video_fps : Fps) -> bool:
+def restore_audio(target_path : str, output_path : str, output_video_fps : Fps) -> bool:
trim_frame_start = facefusion.globals.trim_frame_start
trim_frame_end = facefusion.globals.trim_frame_end
temp_output_video_path = get_temp_output_video_path(target_path)
commands = [ '-hwaccel', 'auto', '-i', temp_output_video_path ]
if trim_frame_start is not None:
-start_time = trim_frame_start / video_fps
+start_time = trim_frame_start / output_video_fps
commands.extend([ '-ss', str(start_time) ])
if trim_frame_end is not None:
-end_time = trim_frame_end / video_fps
+end_time = trim_frame_end / output_video_fps
commands.extend([ '-to', str(end_time) ])
commands.extend([ '-i', target_path, '-c', 'copy', '-map', '0:v:0', '-map', '1:a:0', '-shortest', '-y', output_path ])
return run_ffmpeg(commands)
@@ -111,3 +126,13 @@ def map_nvenc_preset(output_video_preset : OutputVideoPreset) -> Optional[str]:
if output_video_preset == 'veryslow':
return 'p7'
return None
+def map_amf_preset(output_video_preset : OutputVideoPreset) -> Optional[str]:
+if output_video_preset in [ 'ultrafast', 'superfast', 'veryfast' ]:
+return 'speed'
+if output_video_preset in [ 'faster', 'fast', 'medium' ]:
+return 'balanced'
+if output_video_preset in [ 'slow', 'slower', 'veryslow' ]:
+return 'quality'
+return None
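
Context for the reworked run_ffmpeg(): instead of blocking in subprocess.run(), the command is now polled in short slices so a stop request from the process manager is honoured quickly. A stripped-down, generic sketch of the same pattern (not tied to facefusion's globals):

import subprocess
from typing import Callable, List

def run_command_while(commands : List[str], keep_running : Callable[[], bool]) -> bool:
	# wait in 0.5 second slices; bail out once the caller stops processing
	process = subprocess.Popen(commands, stdout = subprocess.PIPE)
	while keep_running():
		try:
			return process.wait(timeout = 0.5) == 0
		except subprocess.TimeoutExpired:
			continue
	# if the loop is abandoned, returncode is still None and this reports failure
	return process.returncode == 0

print(run_command_while([ 'ffmpeg', '-version' ], lambda : True))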


@@ -49,7 +49,7 @@ def clear_temp(target_path : str) -> None:
temp_directory_path = get_temp_directory_path(target_path)
parent_directory_path = os.path.dirname(temp_directory_path)
if not facefusion.globals.keep_temp and is_directory(temp_directory_path):
-shutil.rmtree(temp_directory_path)
+shutil.rmtree(temp_directory_path, ignore_errors = True)
if os.path.exists(parent_directory_path) and not os.listdir(parent_directory_path):
os.rmdir(parent_directory_path)


@@ -24,6 +24,7 @@ face_analyser_gender : Optional[FaceAnalyserGender] = None
face_detector_model : Optional[FaceDetectorModel] = None
face_detector_size : Optional[str] = None
face_detector_score : Optional[float] = None
+face_landmarker_score : Optional[float] = None
face_recognizer_model : Optional[FaceRecognizerModel] = None
# face selector
face_selector_mode : Optional[FaceSelectorMode] = None
@@ -39,10 +40,10 @@ face_mask_regions : Optional[List[FaceMaskRegion]] = None
trim_frame_start : Optional[int] = None
trim_frame_end : Optional[int] = None
temp_frame_format : Optional[TempFrameFormat] = None
-temp_frame_quality : Optional[int] = None
keep_temp : Optional[bool] = None
# output creation
output_image_quality : Optional[int] = None
+output_image_resolution : Optional[str] = None
output_video_encoder : Optional[OutputVideoEncoder] = None
output_video_preset : Optional[OutputVideoPreset] = None
output_video_quality : Optional[int] = None


@@ -9,26 +9,17 @@ from argparse import ArgumentParser, HelpFormatter
from facefusion import metadata, wording
-TORCH : Dict[str, str] =\
-{
-'default': 'default',
-'cpu': 'cpu'
-}
ONNXRUNTIMES : Dict[str, Tuple[str, str]] = {}
if platform.system().lower() == 'darwin':
-ONNXRUNTIMES['default'] = ('onnxruntime', '1.17.0')
+ONNXRUNTIMES['default'] = ('onnxruntime', '1.17.1')
else:
ONNXRUNTIMES['default'] = ('onnxruntime', '1.16.3')
if platform.system().lower() == 'linux' or platform.system().lower() == 'windows':
-TORCH['cuda-12.1'] = 'cu121'
-TORCH['cuda-11.8'] = 'cu118'
-ONNXRUNTIMES['cuda-12.1'] = ('onnxruntime-gpu', '1.17.0')
+ONNXRUNTIMES['cuda-12.2'] = ('onnxruntime-gpu', '1.17.1')
ONNXRUNTIMES['cuda-11.8'] = ('onnxruntime-gpu', '1.16.3')
ONNXRUNTIMES['openvino'] = ('onnxruntime-openvino', '1.16.0')
if platform.system().lower() == 'linux':
-TORCH['rocm-5.4.2'] = 'rocm5.4.2'
-TORCH['rocm-5.6'] = 'rocm5.6'
ONNXRUNTIMES['rocm-5.4.2'] = ('onnxruntime-rocm', '1.16.3')
ONNXRUNTIMES['rocm-5.6'] = ('onnxruntime-rocm', '1.16.3')
if platform.system().lower() == 'windows':
@@ -37,7 +28,6 @@ if platform.system().lower() == 'windows':
def cli() -> None:
program = ArgumentParser(formatter_class = lambda prog: HelpFormatter(prog, max_help_position = 130))
-program.add_argument('--torch', help = wording.get('help.install_dependency').format(dependency = 'torch'), choices = TORCH.keys())
program.add_argument('--onnxruntime', help = wording.get('help.install_dependency').format(dependency = 'onnxruntime'), choices = ONNXRUNTIMES.keys())
program.add_argument('--skip-venv', help = wording.get('help.skip_venv'), action = 'store_true')
program.add_argument('-v', '--version', version = metadata.get('name') + ' ' + metadata.get('version'), action = 'version')
@@ -52,29 +42,21 @@ def run(program : ArgumentParser) -> None:
os.environ['SYSTEM_VERSION_COMPAT'] = '0'
if not args.skip_venv:
os.environ['PIP_REQUIRE_VIRTUALENV'] = '1'
-if args.torch and args.onnxruntime:
+if args.onnxruntime:
answers =\
{
-'torch': args.torch,
'onnxruntime': args.onnxruntime
}
else:
answers = inquirer.prompt(
[
-inquirer.List('torch', message = wording.get('help.install_dependency').format(dependency = 'torch'), choices = list(TORCH.keys())),
inquirer.List('onnxruntime', message = wording.get('help.install_dependency').format(dependency = 'onnxruntime'), choices = list(ONNXRUNTIMES.keys()))
])
if answers:
-torch = answers['torch']
-torch_wheel = TORCH[torch]
onnxruntime = answers['onnxruntime']
onnxruntime_name, onnxruntime_version = ONNXRUNTIMES[onnxruntime]
-subprocess.call([ 'pip', 'uninstall', 'torch', '-y', '-q' ])
-if torch_wheel == 'default':
subprocess.call([ 'pip', 'install', '-r', 'requirements.txt', '--force-reinstall' ])
-else:
-subprocess.call([ 'pip', 'install', '-r', 'requirements.txt', '--extra-index-url', 'https://download.pytorch.org/whl/' + torch_wheel, '--force-reinstall' ])
if onnxruntime == 'rocm-5.4.2' or onnxruntime == 'rocm-5.6':
if python_id in [ 'cp39', 'cp310', 'cp311' ]:
rocm_version = onnxruntime.replace('-', '')


@@ -2,7 +2,7 @@ METADATA =\
{
'name': 'FaceFusion',
'description': 'Next generation face swapper and enhancer',
-'version': '2.3.0',
+'version': '2.4.0',
'license': 'MIT',
'author': 'Henry Ruhs',
'url': 'https://facefusion.io'


@@ -1,25 +1,24 @@
from typing import List, Optional
+import hashlib
import os
-from facefusion.filesystem import is_file, is_directory
+import facefusion.globals
+from facefusion.filesystem import is_directory
from facefusion.typing import Padding, Fps
-def normalize_output_path(source_paths : List[str], target_path : str, output_path : str) -> Optional[str]:
-if is_file(target_path) and is_directory(output_path):
+def normalize_output_path(target_path : Optional[str], output_path : Optional[str]) -> Optional[str]:
+if target_path and output_path:
target_name, target_extension = os.path.splitext(os.path.basename(target_path))
-if source_paths and is_file(source_paths[0]):
-source_name, _ = os.path.splitext(os.path.basename(source_paths[0]))
-return os.path.join(output_path, source_name + '-' + target_name + target_extension)
-return os.path.join(output_path, target_name + target_extension)
-if is_file(target_path) and output_path:
-_, target_extension = os.path.splitext(os.path.basename(target_path))
+if is_directory(output_path):
+output_hash = hashlib.sha1(str(facefusion.globals.__dict__).encode('utf-8')).hexdigest()[:8]
+output_name = target_name + '-' + output_hash
+return os.path.join(output_path, output_name + target_extension)
output_name, output_extension = os.path.splitext(os.path.basename(output_path))
output_directory_path = os.path.dirname(output_path)
if is_directory(output_directory_path) and output_extension:
return os.path.join(output_directory_path, output_name + target_extension)
return None
-return output_path
def normalize_padding(padding : Optional[List[int]]) -> Optional[Padding]:
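
For illustration, the new directory branch of normalize_output_path() derives a short hash from the current globals so repeated runs with different settings do not overwrite each other. A standalone sketch with a plain dict standing in for facefusion.globals.__dict__ (values are made up):

import hashlib
import os

globals_snapshot = { 'output_image_quality': 80, 'output_video_encoder': 'libx264' }
target_path = '/videos/target.mp4'
output_path = '/videos/out'

target_name, target_extension = os.path.splitext(os.path.basename(target_path))
output_hash = hashlib.sha1(str(globals_snapshot).encode('utf-8')).hexdigest()[:8]
print(os.path.join(output_path, target_name + '-' + output_hash + target_extension))
# e.g. /videos/out/target-1a2b3c4d.mp4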

View File

@@ -0,0 +1,45 @@
+from typing import Generator, List
+from facefusion.typing import QueuePayload, ProcessState
+PROCESS_STATE : ProcessState = 'pending'
+def get_process_state() -> ProcessState:
+return PROCESS_STATE
+def set_process_state(process_state : ProcessState) -> None:
+global PROCESS_STATE
+PROCESS_STATE = process_state
+def is_processing() -> bool:
+return get_process_state() == 'processing'
+def is_stopping() -> bool:
+return get_process_state() == 'stopping'
+def is_pending() -> bool:
+return get_process_state() == 'pending'
+def start() -> None:
+set_process_state('processing')
+def stop() -> None:
+set_process_state('stopping')
+def end() -> None:
+set_process_state('pending')
+def manage(queue_payloads : List[QueuePayload]) -> Generator[QueuePayload, None, None]:
+for query_payload in queue_payloads:
+if is_processing():
+yield query_payload
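
A hedged usage sketch of the new module (the payload dicts are simplified; the real ones come from create_queue_payloads()):

from facefusion import process_manager

queue_payloads = [ { 'frame_number': index, 'frame_path': '/tmp/%04d.jpg' % index } for index in range(3) ]

process_manager.start() # 'pending' -> 'processing'
for queue_payload in process_manager.manage(queue_payloads):
	print(queue_payload.get('frame_number'))
	# calling process_manager.stop() here flips the state to 'stopping'
	# and manage() stops yielding further payloads
process_manager.end() # back to 'pending'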


@@ -3,10 +3,10 @@ from typing import List
from facefusion.common_helper import create_int_range
from facefusion.processors.frame.typings import FaceDebuggerItem, FaceEnhancerModel, FaceSwapperModel, FrameEnhancerModel, LipSyncerModel
-face_debugger_items : List[FaceDebuggerItem] = [ 'bounding-box', 'landmark-5', 'landmark-68', 'face-mask', 'score', 'age', 'gender' ]
+face_debugger_items : List[FaceDebuggerItem] = [ 'bounding-box', 'face-landmark-5', 'face-landmark-5/68', 'face-landmark-68', 'face-mask', 'face-detector-score', 'face-landmarker-score', 'age', 'gender' ]
face_enhancer_models : List[FaceEnhancerModel] = [ 'codeformer', 'gfpgan_1.2', 'gfpgan_1.3', 'gfpgan_1.4', 'gpen_bfr_256', 'gpen_bfr_512', 'restoreformer_plus_plus' ]
face_swapper_models : List[FaceSwapperModel] = [ 'blendswap_256', 'inswapper_128', 'inswapper_128_fp16', 'simswap_256', 'simswap_512_unofficial', 'uniface_256' ]
-frame_enhancer_models : List[FrameEnhancerModel] = [ 'real_esrgan_x2plus', 'real_esrgan_x4plus', 'real_esrnet_x4plus' ]
+frame_enhancer_models : List[FrameEnhancerModel] = [ 'lsdir_x4', 'nomos8k_sc_x4', 'real_esrgan_x4', 'real_esrgan_x4_fp16', 'span_kendata_x4' ]
lip_syncer_models : List[LipSyncerModel] = [ 'wav2lip_gan' ]
face_enhancer_blend_range : List[int] = create_int_range(0, 100, 1)


@@ -8,8 +8,8 @@ from typing import Any, List
from tqdm import tqdm
import facefusion.globals
-from facefusion.typing import Process_Frames, QueuePayload
-from facefusion.execution_helper import encode_execution_providers
+from facefusion.typing import ProcessFrames, QueuePayload
+from facefusion.execution import encode_execution_providers
from facefusion import logger, wording
FRAME_PROCESSORS_MODULES : List[ModuleType] = []
@@ -67,7 +67,7 @@ def clear_frame_processors_modules() -> None:
FRAME_PROCESSORS_MODULES = []
-def multi_process_frames(source_paths : List[str], temp_frame_paths : List[str], process_frames : Process_Frames) -> None:
+def multi_process_frames(source_paths : List[str], temp_frame_paths : List[str], process_frames : ProcessFrames) -> None:
queue_payloads = create_queue_payloads(temp_frame_paths)
with tqdm(total = len(queue_payloads), desc = wording.get('processing'), unit = 'frame', ascii = ' =', disable = facefusion.globals.log_level in [ 'warn', 'error' ]) as progress:
progress.set_postfix(


@ -5,13 +5,13 @@ import numpy
import facefusion.globals import facefusion.globals
import facefusion.processors.frame.core as frame_processors import facefusion.processors.frame.core as frame_processors
from facefusion import config, wording from facefusion import config, process_manager, wording
from facefusion.face_analyser import get_one_face, get_many_faces, find_similar_faces, clear_face_analyser from facefusion.face_analyser import get_one_face, get_many_faces, find_similar_faces, clear_face_analyser
from facefusion.face_masker import create_static_box_mask, create_occlusion_mask, create_region_mask, clear_face_occluder, clear_face_parser from facefusion.face_masker import create_static_box_mask, create_occlusion_mask, create_region_mask, clear_face_occluder, clear_face_parser
from facefusion.face_helper import warp_face_by_face_landmark_5, categorize_age, categorize_gender from facefusion.face_helper import warp_face_by_face_landmark_5, categorize_age, categorize_gender
from facefusion.face_store import get_reference_faces from facefusion.face_store import get_reference_faces
from facefusion.content_analyser import clear_content_analyser from facefusion.content_analyser import clear_content_analyser
from facefusion.typing import Face, VisionFrame, Update_Process, ProcessMode, QueuePayload from facefusion.typing import Face, VisionFrame, UpdateProcess, ProcessMode, QueuePayload
from facefusion.vision import read_image, read_static_image, write_image from facefusion.vision import read_image, read_static_image, write_image
from facefusion.processors.frame.typings import FaceDebuggerInputs from facefusion.processors.frame.typings import FaceDebuggerInputs
from facefusion.processors.frame import globals as frame_processors_globals, choices as frame_processors_choices from facefusion.processors.frame import globals as frame_processors_globals, choices as frame_processors_choices
@ -36,7 +36,7 @@ def set_options(key : Literal['model'], value : Any) -> None:
def register_args(program : ArgumentParser) -> None: def register_args(program : ArgumentParser) -> None:
program.add_argument('--face-debugger-items', help = wording.get('help.face_debugger_items').format(choices = ', '.join(frame_processors_choices.face_debugger_items)), default = config.get_str_list('frame_processors.face_debugger_items', 'landmark-5 face-mask'), choices = frame_processors_choices.face_debugger_items, nargs = '+', metavar = 'FACE_DEBUGGER_ITEMS') program.add_argument('--face-debugger-items', help = wording.get('help.face_debugger_items').format(choices = ', '.join(frame_processors_choices.face_debugger_items)), default = config.get_str_list('frame_processors.face_debugger_items', 'face-landmark-5 face-mask'), choices = frame_processors_choices.face_debugger_items, nargs = '+', metavar = 'FACE_DEBUGGER_ITEMS')
def apply_args(program : ArgumentParser) -> None: def apply_args(program : ArgumentParser) -> None:
@ -70,13 +70,15 @@ def post_process() -> None:
def debug_face(target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame: def debug_face(target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
primary_color = (0, 0, 255) primary_color = (0, 0, 255)
secondary_color = (0, 255, 0) secondary_color = (0, 255, 0)
tertiary_color = (255, 255, 0)
bounding_box = target_face.bounding_box.astype(numpy.int32) bounding_box = target_face.bounding_box.astype(numpy.int32)
temp_vision_frame = temp_vision_frame.copy()
+ has_face_landmark_5_fallback = numpy.array_equal(target_face.landmarks.get('5'), target_face.landmarks.get('5/68'))
if 'bounding-box' in frame_processors_globals.face_debugger_items:
- cv2.rectangle(temp_vision_frame, (bounding_box[0], bounding_box[1]), (bounding_box[2], bounding_box[3]), secondary_color, 2)
+ cv2.rectangle(temp_vision_frame, (bounding_box[0], bounding_box[1]), (bounding_box[2], bounding_box[3]), primary_color, 2)
if 'face-mask' in frame_processors_globals.face_debugger_items:
- crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmark['5/68'], 'arcface_128_v2', (512, 512))
+ crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), 'arcface_128_v2', (512, 512))
inverse_matrix = cv2.invertAffineTransform(affine_matrix)
temp_size = temp_vision_frame.shape[:2][::-1]
crop_mask_list = []
@@ -95,30 +97,38 @@ def debug_face(target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
inverse_vision_frame = cv2.threshold(inverse_vision_frame, 100, 255, cv2.THRESH_BINARY)[1]
inverse_vision_frame[inverse_vision_frame > 0] = 255
inverse_contours = cv2.findContours(inverse_vision_frame, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)[0]
- cv2.drawContours(temp_vision_frame, inverse_contours, -1, primary_color, 2)
+ cv2.drawContours(temp_vision_frame, inverse_contours, -1, tertiary_color if has_face_landmark_5_fallback else secondary_color, 2)
- if bounding_box[3] - bounding_box[1] > 60 and bounding_box[2] - bounding_box[0] > 60:
- top = bounding_box[1]
- left = bounding_box[0] + 20
- if 'landmark-5' in frame_processors_globals.face_debugger_items:
- face_landmark_5 = target_face.landmark['5/68'].astype(numpy.int32)
+ if 'face-landmark-5' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('5')):
+ face_landmark_5 = target_face.landmarks.get('5').astype(numpy.int32)
for index in range(face_landmark_5.shape[0]):
cv2.circle(temp_vision_frame, (face_landmark_5[index][0], face_landmark_5[index][1]), 3, primary_color, -1)
- if 'landmark-68' in frame_processors_globals.face_debugger_items:
- face_landmark_68 = target_face.landmark['68'].astype(numpy.int32)
+ if 'face-landmark-5/68' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('5/68')):
+ face_landmark_5_68 = target_face.landmarks.get('5/68').astype(numpy.int32)
+ for index in range(face_landmark_5_68.shape[0]):
+ cv2.circle(temp_vision_frame, (face_landmark_5_68[index][0], face_landmark_5_68[index][1]), 3, tertiary_color if has_face_landmark_5_fallback else secondary_color, -1)
+ if 'face-landmark-68' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('68')):
+ face_landmark_68 = target_face.landmarks.get('68').astype(numpy.int32)
for index in range(face_landmark_68.shape[0]):
cv2.circle(temp_vision_frame, (face_landmark_68[index][0], face_landmark_68[index][1]), 3, secondary_color, -1)
- if 'score' in frame_processors_globals.face_debugger_items:
- face_score_text = str(round(target_face.score, 2))
+ if bounding_box[3] - bounding_box[1] > 50 and bounding_box[2] - bounding_box[0] > 50:
+ top = bounding_box[1]
+ left = bounding_box[0] - 20
+ if 'face-detector-score' in frame_processors_globals.face_debugger_items:
+ face_score_text = str(round(target_face.scores.get('detector'), 2))
top = top + 20
- cv2.putText(temp_vision_frame, face_score_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, secondary_color, 2)
+ cv2.putText(temp_vision_frame, face_score_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, primary_color, 2)
+ if 'face-landmarker-score' in frame_processors_globals.face_debugger_items:
+ face_score_text = str(round(target_face.scores.get('landmarker'), 2))
+ top = top + 20
+ cv2.putText(temp_vision_frame, face_score_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, tertiary_color if has_face_landmark_5_fallback else secondary_color, 2)
if 'age' in frame_processors_globals.face_debugger_items:
face_age_text = categorize_age(target_face.age)
top = top + 20
- cv2.putText(temp_vision_frame, face_age_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, secondary_color, 2)
+ cv2.putText(temp_vision_frame, face_age_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, primary_color, 2)
if 'gender' in frame_processors_globals.face_debugger_items:
face_gender_text = categorize_gender(target_face.gender)
top = top + 20
- cv2.putText(temp_vision_frame, face_gender_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, secondary_color, 2)
+ cv2.putText(temp_vision_frame, face_gender_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, primary_color, 2)
return temp_vision_frame
@@ -127,50 +137,50 @@ def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
def process_frame(inputs : FaceDebuggerInputs) -> VisionFrame:
- reference_faces = inputs['reference_faces']
+ reference_faces = inputs.get('reference_faces')
- target_vision_frame = inputs['target_vision_frame']
+ target_vision_frame = inputs.get('target_vision_frame')
- if 'reference' in facefusion.globals.face_selector_mode:
- similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
- if similar_faces:
- for similar_face in similar_faces:
- target_vision_frame = debug_face(similar_face, target_vision_frame)
- if 'one' in facefusion.globals.face_selector_mode:
- target_face = get_one_face(target_vision_frame)
- if target_face:
- target_vision_frame = debug_face(target_face, target_vision_frame)
- if 'many' in facefusion.globals.face_selector_mode:
+ if facefusion.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
if many_faces:
for target_face in many_faces:
target_vision_frame = debug_face(target_face, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'one':
+ target_face = get_one_face(target_vision_frame)
+ if target_face:
+ target_vision_frame = debug_face(target_face, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'reference':
+ similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
+ if similar_faces:
+ for similar_face in similar_faces:
+ target_vision_frame = debug_face(similar_face, target_vision_frame)
return target_vision_frame
- def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : Update_Process) -> None:
+ def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProcess) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
- for queue_payload in queue_payloads:
+ for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
- write_image(target_vision_path, result_frame)
+ write_image(target_vision_path, output_vision_frame)
update_progress()
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
target_vision_frame = read_static_image(target_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
- write_image(output_path, result_frame)
+ write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
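Every processor's process_frame() now dispatches on the exact face_selector_mode value ('many', 'one', 'reference') instead of the old substring checks. A condensed sketch of the selection logic with stand-in data (the function and values are illustrative):

from typing import List, Optional

def select_faces(mode : str, faces : List[str], reference : Optional[str]) -> List[str]:
	# exact comparison replaces the old substring-style mode checks
	if mode == 'many':
		return faces
	if mode == 'one':
		return faces[:1]
	if mode == 'reference':
		return [ face for face in faces if face == reference ]
	return []

print(select_faces('one', [ 'face_a', 'face_b' ], None)) # ['face_a']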


@@ -7,14 +7,15 @@ import onnxruntime
import facefusion.globals
import facefusion.processors.frame.core as frame_processors
- from facefusion import config, logger, wording
+ from facefusion import config, process_manager, logger, wording
from facefusion.face_analyser import get_many_faces, clear_face_analyser, find_similar_faces, get_one_face
from facefusion.face_masker import create_static_box_mask, create_occlusion_mask, clear_face_occluder
from facefusion.face_helper import warp_face_by_face_landmark_5, paste_back
- from facefusion.execution_helper import apply_execution_provider_options
+ from facefusion.execution import apply_execution_provider_options
from facefusion.content_analyser import clear_content_analyser
from facefusion.face_store import get_reference_faces
- from facefusion.typing import Face, VisionFrame, Update_Process, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
+ from facefusion.normalizer import normalize_output_path
+ from facefusion.typing import Face, VisionFrame, UpdateProcess, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from facefusion.common_helper import create_metavar
from facefusion.filesystem import is_file, is_image, is_video, resolve_relative_path
from facefusion.download import conditional_download, is_download_done
@@ -150,7 +151,7 @@ def pre_process(mode : ProcessMode) -> bool:
if mode in [ 'output', 'preview' ] and not is_image(facefusion.globals.target_path) and not is_video(facefusion.globals.target_path):
logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
return False
- if mode == 'output' and not facefusion.globals.output_path:
+ if mode == 'output' and not normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
@@ -169,7 +170,7 @@ def post_process() -> None:
def enhance_face(target_face: Face, temp_vision_frame : VisionFrame) -> VisionFrame:
model_template = get_options('model').get('template')
model_size = get_options('model').get('size')
- crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmark['5/68'], model_template, model_size)
+ crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), model_template, model_size)
box_mask = create_static_box_mask(crop_vision_frame.shape[:2][::-1], facefusion.globals.face_mask_blur, (0, 0, 0, 0))
crop_mask_list =\
[
@@ -230,50 +231,50 @@ def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
def process_frame(inputs : FaceEnhancerInputs) -> VisionFrame:
- reference_faces = inputs['reference_faces']
+ reference_faces = inputs.get('reference_faces')
- target_vision_frame = inputs['target_vision_frame']
+ target_vision_frame = inputs.get('target_vision_frame')
- if 'reference' in facefusion.globals.face_selector_mode:
- similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
- if similar_faces:
- for similar_face in similar_faces:
- target_vision_frame = enhance_face(similar_face, target_vision_frame)
- if 'one' in facefusion.globals.face_selector_mode:
- target_face = get_one_face(target_vision_frame)
- if target_face:
- target_vision_frame = enhance_face(target_face, target_vision_frame)
- if 'many' in facefusion.globals.face_selector_mode:
+ if facefusion.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
if many_faces:
for target_face in many_faces:
target_vision_frame = enhance_face(target_face, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'one':
+ target_face = get_one_face(target_vision_frame)
+ if target_face:
+ target_vision_frame = enhance_face(target_face, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'reference':
+ similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
+ if similar_faces:
+ for similar_face in similar_faces:
+ target_vision_frame = enhance_face(similar_face, target_vision_frame)
return target_vision_frame
- def process_frames(source_path : List[str], queue_payloads : List[QueuePayload], update_progress : Update_Process) -> None:
+ def process_frames(source_path : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProcess) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
- for queue_payload in queue_payloads:
+ for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
- write_image(target_vision_path, result_frame)
+ write_image(target_vision_path, output_vision_frame)
update_progress()
def process_image(source_path : str, target_path : str, output_path : str) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
target_vision_frame = read_static_image(target_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
- write_image(output_path, result_frame)
+ write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
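pre_process() now validates the output by normalizing it against the target path instead of only checking that an output path was set, which catches cases such as a directory being passed as the output. A rough sketch of what such a normalizer can look like, under simplified assumptions (the real normalize_output_path in facefusion.normalizer may differ in details):

import os
from typing import Optional

def normalize_output_path(target_path : Optional[str], output_path : Optional[str]) -> Optional[str]:
	# illustrative: derive a file name from the target when a directory is given
	if target_path and output_path:
		if os.path.isdir(output_path):
			target_name, target_extension = os.path.splitext(os.path.basename(target_path))
			return os.path.join(output_path, target_name + target_extension)
		return output_path
	return None

print(normalize_output_path('target.mp4', '.')) # ./target.mp4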


@@ -8,14 +8,16 @@ from onnx import numpy_helper
import facefusion.globals
import facefusion.processors.frame.core as frame_processors
- from facefusion import config, logger, wording
+ from facefusion import config, process_manager, logger, wording
- from facefusion.execution_helper import apply_execution_provider_options
+ from facefusion.execution import apply_execution_provider_options
from facefusion.face_analyser import get_one_face, get_average_face, get_many_faces, find_similar_faces, clear_face_analyser
from facefusion.face_masker import create_static_box_mask, create_occlusion_mask, create_region_mask, clear_face_occluder, clear_face_parser
from facefusion.face_helper import warp_face_by_face_landmark_5, paste_back
from facefusion.face_store import get_reference_faces
+ from facefusion.common_helper import extract_major_version
from facefusion.content_analyser import clear_content_analyser
- from facefusion.typing import Face, Embedding, VisionFrame, Update_Process, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
+ from facefusion.normalizer import normalize_output_path
+ from facefusion.typing import Face, Embedding, VisionFrame, UpdateProcess, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from facefusion.filesystem import is_file, is_image, has_image, is_video, filter_image_paths, resolve_relative_path
from facefusion.download import conditional_download, is_download_done
from facefusion.vision import read_image, read_static_image, read_static_images, write_image
@@ -144,7 +146,8 @@ def set_options(key : Literal['model'], value : Any) -> None:
def register_args(program : ArgumentParser) -> None:
- if onnxruntime.__version__ == '1.17.0':
+ onnxruntime_version = extract_major_version(onnxruntime.__version__)
+ if onnxruntime_version > (1, 16):
face_swapper_model_fallback = 'inswapper_128'
else:
face_swapper_model_fallback = 'inswapper_128_fp16'
@@ -197,7 +200,7 @@ def pre_process(mode : ProcessMode) -> bool:
if mode in [ 'output', 'preview' ] and not is_image(facefusion.globals.target_path) and not is_video(facefusion.globals.target_path):
logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
return False
- if mode == 'output' and not facefusion.globals.output_path:
+ if mode == 'output' and not normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
@@ -218,7 +221,7 @@ def post_process() -> None:
def swap_face(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
model_template = get_options('model').get('template')
model_size = get_options('model').get('size')
- crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmark['5/68'], model_template, model_size)
+ crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), model_template, model_size)
crop_mask_list = []
if 'box' in facefusion.globals.face_mask_types:
@@ -259,9 +262,9 @@ def prepare_source_frame(source_face : Face) -> VisionFrame:
model_type = get_options('model').get('type')
source_vision_frame = read_static_image(facefusion.globals.source_paths[0])
if model_type == 'blendswap':
- source_vision_frame, _ = warp_face_by_face_landmark_5(source_vision_frame, source_face.landmark['5/68'], 'arcface_112_v2', (112, 112))
+ source_vision_frame, _ = warp_face_by_face_landmark_5(source_vision_frame, source_face.landmarks.get('5/68'), 'arcface_112_v2', (112, 112))
if model_type == 'uniface':
- source_vision_frame, _ = warp_face_by_face_landmark_5(source_vision_frame, source_face.landmark['5/68'], 'ffhq_512', (256, 256))
+ source_vision_frame, _ = warp_face_by_face_landmark_5(source_vision_frame, source_face.landmarks.get('5/68'), 'ffhq_512', (256, 256))
source_vision_frame = source_vision_frame[:, :, ::-1] / 255.0
source_vision_frame = source_vision_frame.transpose(2, 0, 1)
source_vision_frame = numpy.expand_dims(source_vision_frame, axis = 0).astype(numpy.float32)
@@ -301,42 +304,42 @@ def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
def process_frame(inputs : FaceSwapperInputs) -> VisionFrame:
- reference_faces = inputs['reference_faces']
+ reference_faces = inputs.get('reference_faces')
- source_face = inputs['source_face']
+ source_face = inputs.get('source_face')
- target_vision_frame = inputs['target_vision_frame']
+ target_vision_frame = inputs.get('target_vision_frame')
- if 'reference' in facefusion.globals.face_selector_mode:
- similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
- if similar_faces:
- for similar_face in similar_faces:
- target_vision_frame = swap_face(source_face, similar_face, target_vision_frame)
- if 'one' in facefusion.globals.face_selector_mode:
- target_face = get_one_face(target_vision_frame)
- if target_face:
- target_vision_frame = swap_face(source_face, target_face, target_vision_frame)
- if 'many' in facefusion.globals.face_selector_mode:
+ if facefusion.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
if many_faces:
for target_face in many_faces:
target_vision_frame = swap_face(source_face, target_face, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'one':
+ target_face = get_one_face(target_vision_frame)
+ if target_face:
+ target_vision_frame = swap_face(source_face, target_face, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'reference':
+ similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
+ if similar_faces:
+ for similar_face in similar_faces:
+ target_vision_frame = swap_face(source_face, similar_face, target_vision_frame)
return target_vision_frame
- def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : Update_Process) -> None:
+ def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProcess) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
source_frames = read_static_images(source_paths)
source_face = get_average_face(source_frames)
- for queue_payload in queue_payloads:
+ for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'source_face': source_face,
'target_vision_frame': target_vision_frame
})
- write_image(target_vision_path, result_frame)
+ write_image(target_vision_path, output_vision_frame)
update_progress()
@@ -345,13 +348,13 @@ def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
source_frames = read_static_images(source_paths)
source_face = get_average_face(source_frames)
target_vision_frame = read_static_image(target_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'source_face': source_face,
'target_vision_frame': target_vision_frame
})
- write_image(output_path, result_frame)
+ write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
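The inswapper fallback is now picked by comparing a parsed onnxruntime version instead of matching one exact version string. A sketch with an assumed tuple-returning parser (the real extract_major_version lives in facefusion.common_helper and may be implemented differently):

from typing import Tuple

def extract_major_version(version : str) -> Tuple[int, int]:
	# keep the first two numeric components, e.g. '1.17.0' -> (1, 17)
	parts = version.split('.')
	return (int(parts[0]), int(parts[1]))

# the fp16 weights are only used on older onnxruntime releases
fallback = 'inswapper_128' if extract_major_version('1.17.0') > (1, 16) else 'inswapper_128_fp16'
print(fallback) # inswapper_128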


@@ -2,46 +2,63 @@ from typing import Any, List, Literal, Optional
from argparse import ArgumentParser
import threading
import cv2
- from basicsr.archs.rrdbnet_arch import RRDBNet
- from realesrgan import RealESRGANer
+ import numpy
+ import onnxruntime
import facefusion.globals
import facefusion.processors.frame.core as frame_processors
- from facefusion import config, logger, wording
+ from facefusion import config, process_manager, logger, wording
from facefusion.face_analyser import clear_face_analyser
from facefusion.content_analyser import clear_content_analyser
- from facefusion.typing import Face, VisionFrame, Update_Process, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
+ from facefusion.execution import apply_execution_provider_options
+ from facefusion.normalizer import normalize_output_path
+ from facefusion.typing import Face, VisionFrame, UpdateProcess, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from facefusion.common_helper import create_metavar
- from facefusion.execution_helper import map_torch_backend
- from facefusion.filesystem import is_file, resolve_relative_path
+ from facefusion.filesystem import is_file, resolve_relative_path, is_image, is_video
from facefusion.download import conditional_download, is_download_done
- from facefusion.vision import read_image, read_static_image, write_image
+ from facefusion.vision import read_image, read_static_image, write_image, merge_tile_frames, create_tile_frames
from facefusion.processors.frame.typings import FrameEnhancerInputs
from facefusion.processors.frame import globals as frame_processors_globals
from facefusion.processors.frame import choices as frame_processors_choices
FRAME_PROCESSOR = None
- THREAD_SEMAPHORE : threading.Semaphore = threading.Semaphore()
THREAD_LOCK : threading.Lock = threading.Lock()
NAME = __name__.upper()
MODELS : ModelSet =\
{
- 'real_esrgan_x2plus':
+ 'lsdir_x4':
{
- 'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x2plus.pth',
- 'path': resolve_relative_path('../.assets/models/real_esrgan_x2plus.pth'),
- 'scale': 2
+ 'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/lsdir_x4.onnx',
+ 'path': resolve_relative_path('../.assets/models/lsdir_x4.onnx'),
+ 'size': (128, 8, 2),
- },
- 'real_esrgan_x4plus':
- {
- 'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x4plus.pth',
- 'path': resolve_relative_path('../.assets/models/real_esrgan_x4plus.pth'),
'scale': 4
},
- 'real_esrnet_x4plus':
+ 'nomos8k_sc_x4':
{
- 'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrnet_x4plus.pth',
- 'path': resolve_relative_path('../.assets/models/real_esrnet_x4plus.pth'),
+ 'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/nomos8k_sc_x4.onnx',
+ 'path': resolve_relative_path('../.assets/models/nomos8k_sc_x4.onnx'),
'size': (128, 8, 2),
'scale': 4
},
'real_esrgan_x4':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x4.onnx',
'path': resolve_relative_path('../.assets/models/real_esrgan_x4.onnx'),
'size': (128, 8, 2),
'scale': 4
},
'real_esrgan_x4_fp16':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x4_fp16.onnx',
'path': resolve_relative_path('../.assets/models/real_esrgan_x4_fp16.onnx'),
'size': (128, 8, 2),
'scale': 4
},
'span_kendata_x4':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/span_kendata_x4.onnx',
'path': resolve_relative_path('../.assets/models/span_kendata_x4.onnx'),
'size': (128, 8, 2),
'scale': 4
}
}
@@ -54,17 +71,7 @@ def get_frame_processor() -> Any:
with THREAD_LOCK:
if FRAME_PROCESSOR is None:
model_path = get_options('model').get('path')
- model_scale = get_options('model').get('scale')
- FRAME_PROCESSOR = RealESRGANer(
- model_path = model_path,
- model = RRDBNet(
- num_in_ch = 3,
- num_out_ch = 3,
- scale = model_scale
- ),
- device = map_torch_backend(facefusion.globals.execution_providers),
- scale = model_scale
- )
+ FRAME_PROCESSOR = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(facefusion.globals.execution_providers))
return FRAME_PROCESSOR
@@ -92,7 +99,7 @@ def set_options(key : Literal['model'], value : Any) -> None:
def register_args(program : ArgumentParser) -> None:
- program.add_argument('--frame-enhancer-model', help = wording.get('help.frame_enhancer_model'), default = config.get_str_value('frame_processors.frame_enhancer_model', 'real_esrgan_x2plus'), choices = frame_processors_choices.frame_enhancer_models)
+ program.add_argument('--frame-enhancer-model', help = wording.get('help.frame_enhancer_model'), default = config.get_str_value('frame_processors.frame_enhancer_model', 'span_kendata_x4'), choices = frame_processors_choices.frame_enhancer_models)
program.add_argument('--frame-enhancer-blend', help = wording.get('help.frame_enhancer_blend'), type = int, default = config.get_int_value('frame_processors.frame_enhancer_blend', '80'), choices = frame_processors_choices.frame_enhancer_blend_range, metavar = create_metavar(frame_processors_choices.frame_enhancer_blend_range))
@@ -123,7 +130,10 @@ def post_check() -> bool:
def pre_process(mode : ProcessMode) -> bool:
- if mode == 'output' and not facefusion.globals.output_path:
+ if mode in [ 'output', 'preview' ] and not is_image(facefusion.globals.target_path) and not is_video(facefusion.globals.target_path):
+ logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
+ return False
+ if mode == 'output' and not normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
@@ -139,12 +149,36 @@ def post_process() -> None:
def enhance_frame(temp_vision_frame : VisionFrame) -> VisionFrame:
- with THREAD_SEMAPHORE:
- paste_vision_frame, _ = get_frame_processor().enhance(temp_vision_frame)
- temp_vision_frame = blend_frame(temp_vision_frame, paste_vision_frame)
+ frame_processor = get_frame_processor()
+ size = get_options('model').get('size')
+ scale = get_options('model').get('scale')
temp_height, temp_width = temp_vision_frame.shape[:2]
tile_vision_frames, pad_width, pad_height = create_tile_frames(temp_vision_frame, size)
for index, tile_vision_frame in enumerate(tile_vision_frames):
tile_vision_frame = frame_processor.run(None,
{
frame_processor.get_inputs()[0].name : prepare_tile_frame(tile_vision_frame)
})[0]
tile_vision_frames[index] = normalize_tile_frame(tile_vision_frame)
merge_vision_frame = merge_tile_frames(tile_vision_frames, temp_width * scale, temp_height * scale, pad_width * scale, pad_height * scale, (size[0] * scale, size[1] * scale, size[2] * scale))
temp_vision_frame = blend_frame(temp_vision_frame, merge_vision_frame)
return temp_vision_frame return temp_vision_frame
def prepare_tile_frame(vision_tile_frame : VisionFrame) -> VisionFrame:
vision_tile_frame = numpy.expand_dims(vision_tile_frame[:,:,::-1], axis = 0)
vision_tile_frame = vision_tile_frame.transpose(0, 3, 1, 2)
vision_tile_frame = vision_tile_frame.astype(numpy.float32) / 255
return vision_tile_frame
def normalize_tile_frame(vision_tile_frame : VisionFrame) -> VisionFrame:
vision_tile_frame = vision_tile_frame.transpose(0, 2, 3, 1).squeeze(0) * 255
vision_tile_frame = vision_tile_frame.clip(0, 255).astype(numpy.uint8)[:,:,::-1]
return vision_tile_frame
def blend_frame(temp_vision_frame : VisionFrame, paste_vision_frame : VisionFrame) -> VisionFrame:
frame_enhancer_blend = 1 - (frame_processors_globals.frame_enhancer_blend / 100)
temp_vision_frame = cv2.resize(temp_vision_frame, (paste_vision_frame.shape[1], paste_vision_frame.shape[0]))
@@ -157,29 +191,29 @@ def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
def process_frame(inputs : FrameEnhancerInputs) -> VisionFrame:
- target_vision_frame = inputs['target_vision_frame']
+ target_vision_frame = inputs.get('target_vision_frame')
return enhance_frame(target_vision_frame)
- def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : Update_Process) -> None:
+ def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProcess) -> None:
- for queue_payload in queue_payloads:
+ for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'target_vision_frame': target_vision_frame
})
- write_image(target_vision_path, result_frame)
+ write_image(target_vision_path, output_vision_frame)
update_progress()
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
target_vision_frame = read_static_image(target_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'target_vision_frame': target_vision_frame
})
- write_image(output_path, result_frame)
+ write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
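The ONNX frame enhancer upscales per tile: the frame is cut into fixed-size tiles, each tile runs through the model, and the upscaled tiles are stitched back together before blending with the original. A minimal NumPy sketch of the tiling step only, without the padding/overlap handling that facefusion's create_tile_frames and merge_tile_frames perform:

import numpy

def split_into_tiles(frame : numpy.ndarray, tile_size : int) -> list:
	# pad so both dimensions divide evenly, then cut square tiles row by row
	height, width = frame.shape[:2]
	pad_height = -height % tile_size
	pad_width = -width % tile_size
	frame = numpy.pad(frame, ((0, pad_height), (0, pad_width), (0, 0)))
	tiles = []
	for y in range(0, frame.shape[0], tile_size):
		for x in range(0, frame.shape[1], tile_size):
			tiles.append(frame[y:y + tile_size, x:x + tile_size])
	return tiles

frame = numpy.zeros((270, 480, 3), dtype = numpy.uint8)
print(len(split_into_tiles(frame, 128))) # 12 tiles (3 rows x 4 columns)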


@@ -7,17 +7,18 @@ import onnxruntime
import facefusion.globals
import facefusion.processors.frame.core as frame_processors
- from facefusion import config, logger, wording
+ from facefusion import config, process_manager, logger, wording
- from facefusion.execution_helper import apply_execution_provider_options
+ from facefusion.execution import apply_execution_provider_options
from facefusion.face_analyser import get_one_face, get_many_faces, find_similar_faces, clear_face_analyser
from facefusion.face_masker import create_static_box_mask, create_occlusion_mask, create_mouth_mask, clear_face_occluder, clear_face_parser
- from facefusion.face_helper import warp_face_by_face_landmark_5, warp_face_by_bounding_box, paste_back, create_bounding_box_from_landmark
+ from facefusion.face_helper import warp_face_by_face_landmark_5, warp_face_by_bounding_box, paste_back, create_bounding_box_from_face_landmark_68
from facefusion.face_store import get_reference_faces
from facefusion.content_analyser import clear_content_analyser
- from facefusion.typing import Face, VisionFrame, Update_Process, ProcessMode, ModelSet, OptionsWithModel, AudioFrame, QueuePayload
+ from facefusion.normalizer import normalize_output_path
+ from facefusion.typing import Face, VisionFrame, UpdateProcess, ProcessMode, ModelSet, OptionsWithModel, AudioFrame, QueuePayload
from facefusion.filesystem import is_file, has_audio, resolve_relative_path
from facefusion.download import conditional_download, is_download_done
- from facefusion.audio import read_static_audio, get_audio_frame
+ from facefusion.audio import read_static_audio, get_audio_frame, create_empty_audio_frame
from facefusion.filesystem import is_image, is_video, filter_audio_paths
from facefusion.common_helper import get_first
from facefusion.vision import read_image, write_image, read_static_image
@@ -109,7 +110,7 @@ def pre_process(mode : ProcessMode) -> bool:
if mode in [ 'output', 'preview' ] and not is_image(facefusion.globals.target_path) and not is_video(facefusion.globals.target_path):
logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
return False
- if mode == 'output' and not facefusion.globals.output_path:
+ if mode == 'output' and not normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
@@ -129,23 +130,24 @@ def post_process() -> None:
def sync_lip(target_face : Face, temp_audio_frame : AudioFrame, temp_vision_frame : VisionFrame) -> VisionFrame:
frame_processor = get_frame_processor()
+ crop_mask_list = []
temp_audio_frame = prepare_audio_frame(temp_audio_frame)
- crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmark['5/68'], 'ffhq_512', (512, 512))
+ crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), 'ffhq_512', (512, 512))
- face_landmark_68 = cv2.transform(target_face.landmark['68'].reshape(1, -1, 2), affine_matrix).reshape(-1, 2)
- bounding_box = create_bounding_box_from_landmark(face_landmark_68)
+ if numpy.any(target_face.landmarks.get('68')):
+ face_landmark_68 = cv2.transform(target_face.landmarks.get('68').reshape(1, -1, 2), affine_matrix).reshape(-1, 2)
+ bounding_box = create_bounding_box_from_face_landmark_68(face_landmark_68)
bounding_box[1] -= numpy.abs(bounding_box[3] - bounding_box[1]) * 0.125
mouth_mask = create_mouth_mask(face_landmark_68)
+ crop_mask_list.append(mouth_mask)
+ else:
+ bounding_box = target_face.bounding_box
box_mask = create_static_box_mask(crop_vision_frame.shape[:2][::-1], facefusion.globals.face_mask_blur, facefusion.globals.face_mask_padding)
- crop_mask_list =\
- [
- mouth_mask,
- box_mask
- ]
+ crop_mask_list.append(box_mask)
if 'occlusion' in facefusion.globals.face_mask_types:
occlusion_mask = create_occlusion_mask(crop_vision_frame)
crop_mask_list.append(occlusion_mask)
- close_vision_frame, closeup_matrix = warp_face_by_bounding_box(crop_vision_frame, bounding_box, (96, 96))
+ close_vision_frame, close_matrix = warp_face_by_bounding_box(crop_vision_frame, bounding_box, (96, 96))
close_vision_frame = prepare_crop_frame(close_vision_frame)
close_vision_frame = frame_processor.run(None,
{
@@ -153,7 +155,7 @@ def sync_lip(target_face : Face, temp_audio_frame : AudioFrame, temp_vision_frame : VisionFrame) -> VisionFrame:
'target': close_vision_frame
})[0]
crop_vision_frame = normalize_crop_frame(close_vision_frame)
- crop_vision_frame = cv2.warpAffine(crop_vision_frame, cv2.invertAffineTransform(closeup_matrix), (512, 512), borderMode = cv2.BORDER_REPLICATE)
+ crop_vision_frame = cv2.warpAffine(crop_vision_frame, cv2.invertAffineTransform(close_matrix), (512, 512), borderMode = cv2.BORDER_REPLICATE)
crop_mask = numpy.minimum.reduce(crop_mask_list)
paste_vision_frame = paste_back(temp_vision_frame, crop_vision_frame, crop_mask, affine_matrix)
return paste_vision_frame
@@ -188,60 +190,59 @@ def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
def process_frame(inputs : LipSyncerInputs) -> VisionFrame:
- reference_faces = inputs['reference_faces']
+ reference_faces = inputs.get('reference_faces')
- source_audio_frame = inputs['source_audio_frame']
+ source_audio_frame = inputs.get('source_audio_frame')
- target_vision_frame = inputs['target_vision_frame']
+ target_vision_frame = inputs.get('target_vision_frame')
- is_source_audio_frame = isinstance(source_audio_frame, numpy.ndarray) and source_audio_frame.any()
- if 'reference' in facefusion.globals.face_selector_mode:
- similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
- if similar_faces and is_source_audio_frame:
- for similar_face in similar_faces:
- target_vision_frame = sync_lip(similar_face, source_audio_frame, target_vision_frame)
- if 'one' in facefusion.globals.face_selector_mode:
- target_face = get_one_face(target_vision_frame)
- if target_face and is_source_audio_frame:
- target_vision_frame = sync_lip(target_face, source_audio_frame, target_vision_frame)
- if 'many' in facefusion.globals.face_selector_mode:
+ if facefusion.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
- if many_faces and is_source_audio_frame:
+ if many_faces:
for target_face in many_faces:
target_vision_frame = sync_lip(target_face, source_audio_frame, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'one':
+ target_face = get_one_face(target_vision_frame)
+ if target_face:
+ target_vision_frame = sync_lip(target_face, source_audio_frame, target_vision_frame)
+ if facefusion.globals.face_selector_mode == 'reference':
+ similar_faces = find_similar_faces(reference_faces, target_vision_frame, facefusion.globals.reference_face_distance)
+ if similar_faces:
+ for similar_face in similar_faces:
+ target_vision_frame = sync_lip(similar_face, source_audio_frame, target_vision_frame)
return target_vision_frame
- def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : Update_Process) -> None:
+ def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProcess) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
source_audio_path = get_first(filter_audio_paths(source_paths))
- target_video_fps = facefusion.globals.output_video_fps
- for queue_payload in queue_payloads:
+ for queue_payload in process_manager.manage(queue_payloads):
frame_number = queue_payload['frame_number']
target_vision_path = queue_payload['frame_path']
- source_audio_frame = get_audio_frame(source_audio_path, target_video_fps, frame_number)
+ source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, frame_number)
+ if not numpy.any(source_audio_frame):
+ source_audio_frame = create_empty_audio_frame()
target_vision_frame = read_image(target_vision_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'source_audio_frame': source_audio_frame,
'target_vision_frame': target_vision_frame
})
- write_image(target_vision_path, result_frame)
+ write_image(target_vision_path, output_vision_frame)
update_progress()
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
- source_audio_path = get_first(filter_audio_paths(source_paths))
- source_audio_frame = get_audio_frame(source_audio_path, 25)
+ source_audio_frame = create_empty_audio_frame()
target_vision_frame = read_static_image(target_path)
- result_frame = process_frame(
+ output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'source_audio_frame': source_audio_frame,
'target_vision_frame': target_vision_frame
})
- write_image(output_path, result_frame)
+ write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
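When a frame has no usable audio (silence, or an image target), the lip syncer now substitutes an empty audio frame instead of skipping the face, so the mouth region is repainted consistently across the whole video. A tiny sketch of that fallback, assuming an empty frame is a zeroed mel-spectrogram chunk (the shape and dtype are assumptions; the real create_empty_audio_frame may differ):

import numpy

def create_empty_audio_frame() -> numpy.ndarray:
	# zeros stand in for silence
	return numpy.zeros((80, 16)).astype(numpy.int16)

source_audio_frame = None # e.g. get_audio_frame() returned nothing for this frame number
if not numpy.any(source_audio_frame):
	source_audio_frame = create_empty_audio_frame()
print(source_audio_frame.shape) # (80, 16)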


@@ -2,10 +2,10 @@ from typing import Literal, TypedDict
from facefusion.typing import Face, FaceSet, AudioFrame, VisionFrame
- FaceDebuggerItem = Literal['bounding-box', 'landmark-5', 'landmark-68', 'face-mask', 'score', 'age', 'gender']
+ FaceDebuggerItem = Literal['bounding-box', 'face-landmark-5', 'face-landmark-5/68', 'face-landmark-68', 'face-mask', 'face-detector-score', 'face-landmarker-score', 'age', 'gender']
FaceEnhancerModel = Literal['codeformer', 'gfpgan_1.2', 'gfpgan_1.3', 'gfpgan_1.4', 'gpen_bfr_256', 'gpen_bfr_512', 'restoreformer_plus_plus']
FaceSwapperModel = Literal['blendswap_256', 'inswapper_128', 'inswapper_128_fp16', 'simswap_256', 'simswap_512_unofficial', 'uniface_256']
- FrameEnhancerModel = Literal['real_esrgan_x2plus', 'real_esrgan_x4plus', 'real_esrnet_x4plus']
+ FrameEnhancerModel = Literal['lsdir_x4', 'nomos8k_sc_x4', 'real_esrgan_x4', 'real_esrgan_x4_fp16', 'span_kendata_x4']
LipSyncerModel = Literal['wav2lip_gan']
FaceDebuggerInputs = TypedDict('FaceDebuggerInputs',

facefusion/statistics.py (new file, 51 lines)

@@ -0,0 +1,51 @@
from typing import Any, Dict
import numpy
import facefusion.globals
from facefusion.face_store import FACE_STORE
from facefusion.typing import FaceSet
from facefusion import logger
def create_statistics(static_faces : FaceSet) -> Dict[str, Any]:
face_detector_score_list = []
face_landmarker_score_list = []
statistics =\
{
'min_face_detector_score': 0,
'min_face_landmarker_score': 0,
'max_face_detector_score': 0,
'max_face_landmarker_score': 0,
'average_face_detector_score': 0,
'average_face_landmarker_score': 0,
'total_face_landmark_5_fallbacks': 0,
'total_frames_with_faces': 0,
'total_faces': 0
}
for faces in static_faces.values():
statistics['total_frames_with_faces'] = statistics.get('total_frames_with_faces') + 1
for face in faces:
statistics['total_faces'] = statistics.get('total_faces') + 1
face_detector_score_list.append(face.scores.get('detector'))
face_landmarker_score_list.append(face.scores.get('landmarker'))
if numpy.array_equal(face.landmarks.get('5'), face.landmarks.get('5/68')):
statistics['total_face_landmark_5_fallbacks'] = statistics.get('total_face_landmark_5_fallbacks') + 1
if face_detector_score_list:
statistics['min_face_detector_score'] = round(min(face_detector_score_list), 2)
statistics['max_face_detector_score'] = round(max(face_detector_score_list), 2)
statistics['average_face_detector_score'] = round(numpy.mean(face_detector_score_list), 2)
if face_landmarker_score_list:
statistics['min_face_landmarker_score'] = round(min(face_landmarker_score_list), 2)
statistics['max_face_landmarker_score'] = round(max(face_landmarker_score_list), 2)
statistics['average_face_landmarker_score'] = round(numpy.mean(face_landmarker_score_list), 2)
return statistics
def conditional_log_statistics() -> None:
if facefusion.globals.log_level == 'debug':
statistics = create_statistics(FACE_STORE.get('static_faces'))
for name, value in statistics.items():
logger.debug(str(name) + ': ' + str(value), __name__.upper())
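create_statistics() aggregates detector and landmarker scores over the static face store and counts how often the 5-point landmark fallback kicked in; conditional_log_statistics() only emits the result at debug log level. A small usage sketch with a hypothetical stand-in face carrying just the fields create_statistics() reads:

from collections import namedtuple
import numpy

FakeFace = namedtuple('FakeFace', [ 'scores', 'landmarks' ]) # stand-in, not the real Face
static_faces =\
{
	'frame-hash-1':
	[
		FakeFace(scores = { 'detector': 0.91, 'landmarker': 0.42 }, landmarks = { '5': numpy.ones((5, 2)), '5/68': numpy.ones((5, 2)) })
	]
}
# create_statistics(static_faces) would report total_faces: 1,
# total_face_landmark_5_fallbacks: 1 and average_face_detector_score: 0.91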


@@ -12,12 +12,17 @@ FaceLandmarkSet = TypedDict('FaceLandmarkSet',
'68' : FaceLandmark68 # type: ignore[valid-type]
})
Score = float
+ FaceScoreSet = TypedDict('FaceScoreSet',
+ {
+ 'detector' : Score,
+ 'landmarker' : Score
+ })
Embedding = numpy.ndarray[Any, Any]
Face = namedtuple('Face',
[
'bounding_box',
- 'landmark',
+ 'landmarks',
- 'score',
+ 'scores',
'embedding',
'normed_embedding',
'gender',
@@ -29,6 +34,7 @@ FaceStore = TypedDict('FaceStore',
'static_faces' : FaceSet,
'reference_faces': FaceSet
})
VisionFrame = numpy.ndarray[Any, Any]
Mask = numpy.ndarray[Any, Any]
Matrix = numpy.ndarray[Any, Any]
@@ -43,29 +49,32 @@ Fps = float
Padding = Tuple[int, int, int, int]
Resolution = Tuple[int, int]
+ ProcessState = Literal['processing', 'stopping', 'pending']
QueuePayload = TypedDict('QueuePayload',
{
'frame_number' : int,
'frame_path' : str
})
- Update_Process = Callable[[], None]
+ UpdateProcess = Callable[[], None]
- Process_Frames = Callable[[List[str], List[QueuePayload], Update_Process], None]
+ ProcessFrames = Callable[[List[str], List[QueuePayload], UpdateProcess], None]
- Template = Literal['arcface_112_v1', 'arcface_112_v2', 'arcface_128_v2', 'ffhq_512']
+ WarpTemplate = Literal['arcface_112_v1', 'arcface_112_v2', 'arcface_128_v2', 'ffhq_512']
+ WarpTemplateSet = Dict[WarpTemplate, numpy.ndarray[Any, Any]]
ProcessMode = Literal['output', 'preview', 'stream']
LogLevel = Literal['error', 'warn', 'info', 'debug']
VideoMemoryStrategy = Literal['strict', 'moderate', 'tolerant']
- FaceSelectorMode = Literal['reference', 'one', 'many']
+ FaceSelectorMode = Literal['many', 'one', 'reference']
FaceAnalyserOrder = Literal['left-right', 'right-left', 'top-bottom', 'bottom-top', 'small-large', 'large-small', 'best-worst', 'worst-best']
FaceAnalyserAge = Literal['child', 'teen', 'adult', 'senior']
FaceAnalyserGender = Literal['female', 'male']
- FaceDetectorModel = Literal['retinaface', 'yoloface', 'yunet']
+ FaceDetectorModel = Literal['many', 'retinaface', 'scrfd', 'yoloface', 'yunet']
+ FaceDetectorTweak = Literal['low-luminance', 'high-luminance']
FaceRecognizerModel = Literal['arcface_blendswap', 'arcface_inswapper', 'arcface_simswap', 'arcface_uniface']
FaceMaskType = Literal['box', 'occlusion', 'region']
FaceMaskRegion = Literal['skin', 'left-eyebrow', 'right-eyebrow', 'left-eye', 'right-eye', 'eye-glasses', 'nose', 'mouth', 'upper-lip', 'lower-lip']
TempFrameFormat = Literal['jpg', 'png', 'bmp']
- OutputVideoEncoder = Literal['libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc']
+ OutputVideoEncoder = Literal['libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc', 'h264_amf', 'hevc_amf']
OutputVideoPreset = Literal['ultrafast', 'superfast', 'veryfast', 'faster', 'fast', 'medium', 'slow', 'slower', 'veryslow']
ModelValue = Dict[str, Any]
@@ -74,3 +83,38 @@ OptionsWithModel = TypedDict('OptionsWithModel',
{
'model' : ModelValue
})
ValueAndUnit = TypedDict('ValueAndUnit',
{
'value' : str,
'unit' : str
})
ExecutionDeviceFramework = TypedDict('ExecutionDeviceFramework',
{
'name' : str,
'version' : str
})
ExecutionDeviceProduct = TypedDict('ExecutionDeviceProduct',
{
'vendor' : str,
'name' : str,
'architecture' : str,
})
ExecutionDeviceVideoMemory = TypedDict('ExecutionDeviceVideoMemory',
{
'total' : ValueAndUnit,
'free' : ValueAndUnit
})
ExecutionDeviceUtilization = TypedDict('ExecutionDeviceUtilization',
{
'gpu' : ValueAndUnit,
'memory' : ValueAndUnit
})
ExecutionDevice = TypedDict('ExecutionDevice',
{
'driver_version' : str,
'framework' : ExecutionDeviceFramework,
'product' : ExecutionDeviceProduct,
'video_memory' : ExecutionDeviceVideoMemory,
'utilization' : ExecutionDeviceUtilization
})
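For reference, a value matching these nested TypedDicts might look like the following sketch (the concrete device, driver and memory figures are illustrative assumptions, not taken from any real query):

execution_device : ExecutionDevice =\
{
    'driver_version' : '535.129.03',
    'framework' : { 'name' : 'CUDA', 'version' : '12.2' },
    'product' : { 'vendor' : 'NVIDIA', 'name' : 'GeForce RTX 3090', 'architecture' : 'Ampere' },
    'video_memory' : { 'total' : { 'value' : '24576', 'unit' : 'MiB' }, 'free' : { 'value' : '23296', 'unit' : 'MiB' } },
    'utilization' : { 'gpu' : { 'value' : '0', 'unit' : '%' }, 'memory' : { 'value' : '1', 'unit' : '%' } }
}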


@ -42,3 +42,17 @@
grid-template-columns: repeat(var(--grid-cols), minmax(5em, 1fr));
grid-template-rows: repeat(var(--grid-rows), minmax(5em, 1fr));
}
:root:root:root .tab-nav > button
{
border: unset;
border-bottom: 0.125rem solid transparent;
font-size: 1.125em;
margin: 0.5rem 1rem;
padding: 0;
}
:root:root:root .tab-nav > button.selected
{
border-bottom: 0.125rem solid;
}


@ -1,17 +1,16 @@
from typing import Any, Optional, List, Dict, Generator
from time import sleep, perf_counter
import tempfile
import statistics
import gradio

import facefusion.globals
from facefusion import process_manager, wording
from facefusion.face_store import clear_static_faces
from facefusion.processors.frame.core import get_frame_processors_modules
from facefusion.vision import count_video_frame_total, detect_video_resolution, detect_video_fps, pack_resolution
from facefusion.core import conditional_process
from facefusion.memory import limit_system_memory
from facefusion.filesystem import clear_temp
from facefusion.uis.core import get_ui_component
@ -70,6 +69,7 @@ def render() -> None:
def listen() -> None:
    benchmark_runs_checkbox_group = get_ui_component('benchmark_runs_checkbox_group')
    benchmark_cycles_slider = get_ui_component('benchmark_cycles_slider')
    if benchmark_runs_checkbox_group and benchmark_cycles_slider:
        BENCHMARK_START_BUTTON.click(start, inputs = [ benchmark_runs_checkbox_group, benchmark_cycles_slider ], outputs = BENCHMARK_RESULTS_DATAFRAME)
        BENCHMARK_CLEAR_BUTTON.click(clear, outputs = BENCHMARK_RESULTS_DATAFRAME)
@ -77,10 +77,13 @@ def listen() -> None:
def start(benchmark_runs : List[str], benchmark_cycles : int) -> Generator[List[Any], None, None]:
    facefusion.globals.source_paths = [ '.assets/examples/source.jpg' ]
    facefusion.globals.output_path = tempfile.gettempdir()
    facefusion.globals.face_landmarker_score = 0
    facefusion.globals.temp_frame_format = 'bmp'
    facefusion.globals.output_video_preset = 'ultrafast'
    benchmark_results = []
    target_paths = [ BENCHMARKS[benchmark_run] for benchmark_run in benchmark_runs if benchmark_run in BENCHMARKS ]

    if target_paths:
        pre_process()
        for target_path in target_paths:
@ -103,16 +106,16 @@ def post_process() -> None:
def benchmark(target_path : str, benchmark_cycles : int) -> List[Any]:
    process_times = []
    total_fps = 0.0
    facefusion.globals.target_path = target_path
    video_frame_total = count_video_frame_total(facefusion.globals.target_path)
    output_video_resolution = detect_video_resolution(facefusion.globals.target_path)
    facefusion.globals.output_video_resolution = pack_resolution(output_video_resolution)
    facefusion.globals.output_video_fps = detect_video_fps(facefusion.globals.target_path)

    for index in range(benchmark_cycles):
        start_time = perf_counter()
        conditional_process()
        end_time = perf_counter()
        process_time = end_time - start_time
        total_fps += video_frame_total / process_time
        process_times.append(process_time)
@ -132,6 +135,8 @@ def benchmark(target_path : str, benchmark_cycles : int) -> List[Any]:
def clear() -> gradio.Dataframe:
    while process_manager.is_processing():
        sleep(0.5)
    if facefusion.globals.target_path:
        clear_temp(facefusion.globals.target_path)
    return gradio.Dataframe(value = None)


@ -6,7 +6,7 @@ import facefusion.globals
from facefusion import wording
from facefusion.face_analyser import clear_face_analyser
from facefusion.processors.frame.core import clear_frame_processors_modules
from facefusion.execution import encode_execution_providers, decode_execution_providers

EXECUTION_PROVIDERS_CHECKBOX_GROUP : Optional[gradio.CheckboxGroup] = None
@ -28,7 +28,6 @@ def listen() -> None:
def update_execution_providers(execution_providers : List[str]) -> gradio.CheckboxGroup:
    clear_face_analyser()
    clear_frame_processors_modules()
    execution_providers = execution_providers or encode_execution_providers(onnxruntime.get_available_providers())
    facefusion.globals.execution_providers = decode_execution_providers(execution_providers)
    return gradio.CheckboxGroup(value = execution_providers)


@ -11,18 +11,20 @@ from facefusion.uis.core import register_ui_component
FACE_ANALYSER_ORDER_DROPDOWN : Optional[gradio.Dropdown] = None
FACE_ANALYSER_AGE_DROPDOWN : Optional[gradio.Dropdown] = None
FACE_ANALYSER_GENDER_DROPDOWN : Optional[gradio.Dropdown] = None
FACE_DETECTOR_MODEL_DROPDOWN : Optional[gradio.Dropdown] = None
FACE_DETECTOR_SIZE_DROPDOWN : Optional[gradio.Dropdown] = None
FACE_DETECTOR_SCORE_SLIDER : Optional[gradio.Slider] = None
FACE_LANDMARKER_SCORE_SLIDER : Optional[gradio.Slider] = None

def render() -> None:
    global FACE_ANALYSER_ORDER_DROPDOWN
    global FACE_ANALYSER_AGE_DROPDOWN
    global FACE_ANALYSER_GENDER_DROPDOWN
    global FACE_DETECTOR_MODEL_DROPDOWN
    global FACE_DETECTOR_SIZE_DROPDOWN
    global FACE_DETECTOR_SCORE_SLIDER
    global FACE_LANDMARKER_SCORE_SLIDER

    face_detector_size_dropdown_args : Dict[str, Any] =\
    {
@ -53,6 +55,7 @@ def render() -> None:
        value = facefusion.globals.face_detector_model
    )
    FACE_DETECTOR_SIZE_DROPDOWN = gradio.Dropdown(**face_detector_size_dropdown_args)
    with gradio.Row():
        FACE_DETECTOR_SCORE_SLIDER = gradio.Slider(
            label = wording.get('uis.face_detector_score_slider'),
            value = facefusion.globals.face_detector_score,
@ -60,12 +63,20 @@ def render() -> None:
            minimum = facefusion.choices.face_detector_score_range[0],
            maximum = facefusion.choices.face_detector_score_range[-1]
        )
        FACE_LANDMARKER_SCORE_SLIDER = gradio.Slider(
            label = wording.get('uis.face_landmarker_score_slider'),
            value = facefusion.globals.face_landmarker_score,
            step = facefusion.choices.face_landmarker_score_range[1] - facefusion.choices.face_landmarker_score_range[0],
            minimum = facefusion.choices.face_landmarker_score_range[0],
            maximum = facefusion.choices.face_landmarker_score_range[-1]
        )
    register_ui_component('face_analyser_order_dropdown', FACE_ANALYSER_ORDER_DROPDOWN)
    register_ui_component('face_analyser_age_dropdown', FACE_ANALYSER_AGE_DROPDOWN)
    register_ui_component('face_analyser_gender_dropdown', FACE_ANALYSER_GENDER_DROPDOWN)
    register_ui_component('face_detector_model_dropdown', FACE_DETECTOR_MODEL_DROPDOWN)
    register_ui_component('face_detector_size_dropdown', FACE_DETECTOR_SIZE_DROPDOWN)
    register_ui_component('face_detector_score_slider', FACE_DETECTOR_SCORE_SLIDER)
    register_ui_component('face_landmarker_score_slider', FACE_LANDMARKER_SCORE_SLIDER)
def listen() -> None:
@ -74,7 +85,8 @@ def listen() -> None:
    FACE_ANALYSER_GENDER_DROPDOWN.change(update_face_analyser_gender, inputs = FACE_ANALYSER_GENDER_DROPDOWN)
    FACE_DETECTOR_MODEL_DROPDOWN.change(update_face_detector_model, inputs = FACE_DETECTOR_MODEL_DROPDOWN, outputs = FACE_DETECTOR_SIZE_DROPDOWN)
    FACE_DETECTOR_SIZE_DROPDOWN.change(update_face_detector_size, inputs = FACE_DETECTOR_SIZE_DROPDOWN)
    FACE_DETECTOR_SCORE_SLIDER.release(update_face_detector_score, inputs = FACE_DETECTOR_SCORE_SLIDER)
    FACE_LANDMARKER_SCORE_SLIDER.release(update_face_landmarker_score, inputs = FACE_LANDMARKER_SCORE_SLIDER)
def update_face_analyser_order(face_analyser_order : FaceAnalyserOrder) -> None:
@ -91,9 +103,10 @@ def update_face_analyser_gender(face_analyser_gender : FaceAnalyserGender) -> No
def update_face_detector_model(face_detector_model : FaceDetectorModel) -> gradio.Dropdown:
    facefusion.globals.face_detector_model = face_detector_model
    facefusion.globals.face_detector_size = '640x640'
    if facefusion.globals.face_detector_size in facefusion.choices.face_detector_set[face_detector_model]:
        return gradio.Dropdown(value = facefusion.globals.face_detector_size, choices = facefusion.choices.face_detector_set[face_detector_model])
    return gradio.Dropdown(value = facefusion.globals.face_detector_size, choices = [ facefusion.globals.face_detector_size ])
def update_face_detector_size(face_detector_size : str) -> None:
@ -102,3 +115,7 @@ def update_face_detector_size(face_detector_size : str) -> None:
def update_face_detector_score(face_detector_score : float) -> None:
    facefusion.globals.face_detector_score = face_detector_score
def update_face_landmarker_score(face_landmarker_score : float) -> None:
facefusion.globals.face_landmarker_score = face_landmarker_score


@ -100,12 +100,10 @@ def listen() -> None:
def update_face_mask_type(face_mask_types : List[FaceMaskType]) -> Tuple[gradio.CheckboxGroup, gradio.Group, gradio.CheckboxGroup]:
    facefusion.globals.face_mask_types = face_mask_types or facefusion.choices.face_mask_types
    has_box_mask = 'box' in face_mask_types
    has_region_mask = 'region' in face_mask_types
    return gradio.CheckboxGroup(value = facefusion.globals.face_mask_types), gradio.Group(visible = has_box_mask), gradio.CheckboxGroup(visible = has_region_mask)
def update_face_mask_blur(face_mask_blur : float) -> None:
@ -117,7 +115,5 @@ def update_face_mask_padding(face_mask_padding_top : int, face_mask_padding_righ
def update_face_mask_regions(face_mask_regions : List[FaceMaskRegion]) -> gradio.CheckboxGroup:
    facefusion.globals.face_mask_regions = face_mask_regions or facefusion.choices.face_mask_regions
    return gradio.CheckboxGroup(value = facefusion.globals.face_mask_regions)


@ -85,7 +85,8 @@ def listen() -> None:
    [
        'face_detector_model_dropdown',
        'face_detector_size_dropdown',
        'face_detector_score_slider',
        'face_landmarker_score_slider'
    ]
    for component_name in change_two_component_names:
        component = get_ui_component(component_name)
@ -98,15 +99,15 @@ def listen() -> None:
def update_face_selector_mode(face_selector_mode : FaceSelectorMode) -> Tuple[gradio.Gallery, gradio.Slider]:
    if face_selector_mode == 'many':
        facefusion.globals.face_selector_mode = face_selector_mode
        return gradio.Gallery(visible = False), gradio.Slider(visible = False)
    if face_selector_mode == 'one':
        facefusion.globals.face_selector_mode = face_selector_mode
        return gradio.Gallery(visible = False), gradio.Slider(visible = False)
    if face_selector_mode == 'reference':
        facefusion.globals.face_selector_mode = face_selector_mode
        return gradio.Gallery(visible = True), gradio.Slider(visible = True)
def clear_and_update_reference_face_position(event : gradio.SelectData) -> gradio.Gallery:


@ -32,7 +32,7 @@ def update_frame_processors(frame_processors : List[str]) -> gradio.CheckboxGrou
    frame_processor_module = load_frame_processor_module(frame_processor)
    if not frame_processor_module.pre_check():
        return gradio.CheckboxGroup()
    return gradio.CheckboxGroup(value = facefusion.globals.frame_processors, choices = sort_frame_processors(facefusion.globals.frame_processors))

def sort_frame_processors(frame_processors : List[str]) -> list[str]:


@ -113,7 +113,7 @@ def update_face_enhancer_model(face_enhancer_model : FaceEnhancerModel) -> gradi
    face_enhancer_module.clear_frame_processor()
    face_enhancer_module.set_options('model', face_enhancer_module.MODELS[face_enhancer_model])
    if face_enhancer_module.pre_check():
        return gradio.Dropdown(value = frame_processors_globals.face_enhancer_model)
    return gradio.Dropdown()
@ -135,7 +135,7 @@ def update_face_swapper_model(face_swapper_model : FaceSwapperModel) -> gradio.D
    face_swapper_module.clear_frame_processor()
    face_swapper_module.set_options('model', face_swapper_module.MODELS[face_swapper_model])
    if face_swapper_module.pre_check():
        return gradio.Dropdown(value = frame_processors_globals.face_swapper_model)
    return gradio.Dropdown()
@ -145,7 +145,7 @@ def update_frame_enhancer_model(frame_enhancer_model : FrameEnhancerModel) -> gr
    frame_enhancer_module.clear_frame_processor()
    frame_enhancer_module.set_options('model', frame_enhancer_module.MODELS[frame_enhancer_model])
    if frame_enhancer_module.pre_check():
        return gradio.Dropdown(value = frame_processors_globals.frame_enhancer_model)
    return gradio.Dropdown()
@ -159,5 +159,5 @@ def update_lip_syncer_model(lip_syncer_model : LipSyncerModel) -> gradio.Dropdow
    lip_syncer_module.clear_frame_processor()
    lip_syncer_module.set_options('model', lip_syncer_module.MODELS[lip_syncer_model])
    if lip_syncer_module.pre_check():
        return gradio.Dropdown(value = frame_processors_globals.lip_syncer_model)
    return gradio.Dropdown()


@ -1,24 +1,27 @@
from typing import Tuple, Optional
from time import sleep
import gradio

import facefusion.globals
from facefusion import process_manager, wording
from facefusion.core import conditional_process
from facefusion.memory import limit_system_memory
from facefusion.uis.core import get_ui_component
from facefusion.normalizer import normalize_output_path
from facefusion.filesystem import clear_temp, is_image, is_video
OUTPUT_IMAGE : Optional[gradio.Image] = None
OUTPUT_VIDEO : Optional[gradio.Video] = None
OUTPUT_START_BUTTON : Optional[gradio.Button] = None
OUTPUT_CLEAR_BUTTON : Optional[gradio.Button] = None
OUTPUT_STOP_BUTTON : Optional[gradio.Button] = None

def render() -> None:
    global OUTPUT_IMAGE
    global OUTPUT_VIDEO
    global OUTPUT_START_BUTTON
    global OUTPUT_STOP_BUTTON
    global OUTPUT_CLEAR_BUTTON

    OUTPUT_IMAGE = gradio.Image(
@ -33,6 +36,12 @@ def render() -> None:
        variant = 'primary',
        size = 'sm'
    )
    OUTPUT_STOP_BUTTON = gradio.Button(
        value = wording.get('uis.stop_button'),
        variant = 'primary',
        size = 'sm',
        visible = False
    )
    OUTPUT_CLEAR_BUTTON = gradio.Button(
        value = wording.get('uis.clear_button'),
        size = 'sm'
@ -42,23 +51,38 @@ def render() -> None:
def listen() -> None:
    output_path_textbox = get_ui_component('output_path_textbox')
    if output_path_textbox:
        OUTPUT_START_BUTTON.click(start, outputs = [ OUTPUT_START_BUTTON, OUTPUT_STOP_BUTTON ])
        OUTPUT_START_BUTTON.click(process, outputs = [ OUTPUT_IMAGE, OUTPUT_VIDEO, OUTPUT_START_BUTTON, OUTPUT_STOP_BUTTON ])
        OUTPUT_STOP_BUTTON.click(stop, outputs = [ OUTPUT_START_BUTTON, OUTPUT_STOP_BUTTON ])
    OUTPUT_CLEAR_BUTTON.click(clear, outputs = [ OUTPUT_IMAGE, OUTPUT_VIDEO ])

def start() -> Tuple[gradio.Button, gradio.Button]:
    while not process_manager.is_processing():
        sleep(0.5)
    return gradio.Button(visible = False), gradio.Button(visible = True)

def process() -> Tuple[gradio.Image, gradio.Video, gradio.Button, gradio.Button]:
    normed_output_path = normalize_output_path(facefusion.globals.target_path, facefusion.globals.output_path)
    if facefusion.globals.system_memory_limit > 0:
        limit_system_memory(facefusion.globals.system_memory_limit)
    conditional_process()
    if is_image(normed_output_path):
        return gradio.Image(value = normed_output_path, visible = True), gradio.Video(value = None, visible = False), gradio.Button(visible = True), gradio.Button(visible = False)
    if is_video(normed_output_path):
        return gradio.Image(value = None, visible = False), gradio.Video(value = normed_output_path, visible = True), gradio.Button(visible = True), gradio.Button(visible = False)
    return gradio.Image(value = None), gradio.Video(value = None), gradio.Button(visible = True), gradio.Button(visible = False)

def stop() -> Tuple[gradio.Button, gradio.Button]:
    process_manager.stop()
    return gradio.Button(visible = True), gradio.Button(visible = False)

def clear() -> Tuple[gradio.Image, gradio.Video]:
    while process_manager.is_processing():
        sleep(0.5)
    if facefusion.globals.target_path:
        clear_temp(facefusion.globals.target_path)
    return gradio.Image(value = None), gradio.Video(value = None)
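The start/stop buttons above lean on the new process_manager module, which is not shown in this diff. A minimal sketch of what such a module could look like follows; only is_processing(), start() and stop() are referenced by the UI code here, everything else is an assumption:

# Hypothetical sketch of facefusion/process_manager.py
from typing import Literal

ProcessState = Literal['pending', 'processing', 'stopping']

PROCESS_STATE : ProcessState = 'pending'

def get_process_state() -> ProcessState:
    return PROCESS_STATE

def set_process_state(process_state : ProcessState) -> None:
    global PROCESS_STATE
    PROCESS_STATE = process_state

def is_processing() -> bool:
    # the UI polls this while waiting for a job to begin or finish
    return PROCESS_STATE == 'processing'

def start() -> None:
    set_process_state('processing')

def stop() -> None:
    # the frame processing loop is expected to check the state and bail out
    set_process_state('stopping')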


@ -1,5 +1,4 @@
from typing import Optional, Tuple, List
import gradio

import facefusion.globals
@ -9,10 +8,11 @@ from facefusion.typing import OutputVideoEncoder, OutputVideoPreset, Fps
from facefusion.filesystem import is_image, is_video
from facefusion.uis.typing import ComponentName
from facefusion.uis.core import get_ui_component, register_ui_component
from facefusion.vision import detect_image_resolution, create_image_resolutions, detect_video_fps, detect_video_resolution, create_video_resolutions, pack_resolution

OUTPUT_PATH_TEXTBOX : Optional[gradio.Textbox] = None
OUTPUT_IMAGE_QUALITY_SLIDER : Optional[gradio.Slider] = None
OUTPUT_IMAGE_RESOLUTION_DROPDOWN : Optional[gradio.Dropdown] = None
OUTPUT_VIDEO_ENCODER_DROPDOWN : Optional[gradio.Dropdown] = None
OUTPUT_VIDEO_PRESET_DROPDOWN : Optional[gradio.Dropdown] = None
OUTPUT_VIDEO_RESOLUTION_DROPDOWN : Optional[gradio.Dropdown] = None
@ -23,15 +23,25 @@ OUTPUT_VIDEO_FPS_SLIDER : Optional[gradio.Slider] = None
def render() -> None:
    global OUTPUT_PATH_TEXTBOX
    global OUTPUT_IMAGE_QUALITY_SLIDER
    global OUTPUT_IMAGE_RESOLUTION_DROPDOWN
    global OUTPUT_VIDEO_ENCODER_DROPDOWN
    global OUTPUT_VIDEO_PRESET_DROPDOWN
    global OUTPUT_VIDEO_RESOLUTION_DROPDOWN
    global OUTPUT_VIDEO_QUALITY_SLIDER
    global OUTPUT_VIDEO_FPS_SLIDER

    output_image_resolutions = []
    output_video_resolutions = []
    if is_image(facefusion.globals.target_path):
        output_image_resolution = detect_image_resolution(facefusion.globals.target_path)
        output_image_resolutions = create_image_resolutions(output_image_resolution)
    if is_video(facefusion.globals.target_path):
        output_video_resolution = detect_video_resolution(facefusion.globals.target_path)
        output_video_resolutions = create_video_resolutions(output_video_resolution)
    facefusion.globals.output_path = facefusion.globals.output_path or '.'
    OUTPUT_PATH_TEXTBOX = gradio.Textbox(
        label = wording.get('uis.output_path_textbox'),
        value = facefusion.globals.output_path,
        max_lines = 1
    )
    OUTPUT_IMAGE_QUALITY_SLIDER = gradio.Slider(
@ -42,6 +52,12 @@ def render() -> None:
        maximum = facefusion.choices.output_image_quality_range[-1],
        visible = is_image(facefusion.globals.target_path)
    )
    OUTPUT_IMAGE_RESOLUTION_DROPDOWN = gradio.Dropdown(
        label = wording.get('uis.output_image_resolution_dropdown'),
        choices = output_image_resolutions,
        value = facefusion.globals.output_image_resolution,
        visible = is_image(facefusion.globals.target_path)
    )
    OUTPUT_VIDEO_ENCODER_DROPDOWN = gradio.Dropdown(
        label = wording.get('uis.output_video_encoder_dropdown'),
        choices = facefusion.choices.output_video_encoders,
@ -64,7 +80,7 @@ def render() -> None:
    )
    OUTPUT_VIDEO_RESOLUTION_DROPDOWN = gradio.Dropdown(
        label = wording.get('uis.output_video_resolution_dropdown'),
        choices = output_video_resolutions,
        value = facefusion.globals.output_video_resolution,
        visible = is_video(facefusion.globals.target_path)
    )
@ -83,6 +99,7 @@ def render() -> None:
def listen() -> None:
    OUTPUT_PATH_TEXTBOX.change(update_output_path, inputs = OUTPUT_PATH_TEXTBOX)
    OUTPUT_IMAGE_QUALITY_SLIDER.change(update_output_image_quality, inputs = OUTPUT_IMAGE_QUALITY_SLIDER)
    OUTPUT_IMAGE_RESOLUTION_DROPDOWN.change(update_output_image_resolution, inputs = OUTPUT_IMAGE_RESOLUTION_DROPDOWN)
    OUTPUT_VIDEO_ENCODER_DROPDOWN.change(update_output_video_encoder, inputs = OUTPUT_VIDEO_ENCODER_DROPDOWN)
    OUTPUT_VIDEO_PRESET_DROPDOWN.change(update_output_video_preset, inputs = OUTPUT_VIDEO_PRESET_DROPDOWN)
    OUTPUT_VIDEO_QUALITY_SLIDER.change(update_output_video_quality, inputs = OUTPUT_VIDEO_QUALITY_SLIDER)
@ -97,19 +114,22 @@ def listen() -> None:
        component = get_ui_component(component_name)
        if component:
            for method in [ 'upload', 'change', 'clear' ]:
                getattr(component, method)(remote_update, outputs = [ OUTPUT_IMAGE_QUALITY_SLIDER, OUTPUT_IMAGE_RESOLUTION_DROPDOWN, OUTPUT_VIDEO_ENCODER_DROPDOWN, OUTPUT_VIDEO_PRESET_DROPDOWN, OUTPUT_VIDEO_QUALITY_SLIDER, OUTPUT_VIDEO_RESOLUTION_DROPDOWN, OUTPUT_VIDEO_FPS_SLIDER ])

def remote_update() -> Tuple[gradio.Slider, gradio.Dropdown, gradio.Dropdown, gradio.Dropdown, gradio.Slider, gradio.Dropdown, gradio.Slider]:
    if is_image(facefusion.globals.target_path):
        output_image_resolution = detect_image_resolution(facefusion.globals.target_path)
        output_image_resolutions = create_image_resolutions(output_image_resolution)
        facefusion.globals.output_image_resolution = pack_resolution(output_image_resolution)
        return gradio.Slider(visible = True), gradio.Dropdown(visible = True, value = facefusion.globals.output_image_resolution, choices = output_image_resolutions), gradio.Dropdown(visible = False), gradio.Dropdown(visible = False), gradio.Slider(visible = False), gradio.Dropdown(visible = False, value = None, choices = None), gradio.Slider(visible = False, value = None)
    if is_video(facefusion.globals.target_path):
        output_video_resolution = detect_video_resolution(facefusion.globals.target_path)
        output_video_resolutions = create_video_resolutions(output_video_resolution)
        facefusion.globals.output_video_resolution = pack_resolution(output_video_resolution)
        facefusion.globals.output_video_fps = detect_video_fps(facefusion.globals.target_path)
        return gradio.Slider(visible = False), gradio.Dropdown(visible = False), gradio.Dropdown(visible = True), gradio.Dropdown(visible = True), gradio.Slider(visible = True), gradio.Dropdown(visible = True, value = facefusion.globals.output_video_resolution, choices = output_video_resolutions), gradio.Slider(visible = True, value = facefusion.globals.output_video_fps)
    return gradio.Slider(visible = False), gradio.Dropdown(visible = False, value = None, choices = None), gradio.Dropdown(visible = False), gradio.Dropdown(visible = False), gradio.Slider(visible = False), gradio.Dropdown(visible = False, value = None, choices = None), gradio.Slider(visible = False, value = None)
def update_output_path(output_path : str) -> None:
@ -120,6 +140,10 @@ def update_output_image_quality(output_image_quality : int) -> None:
    facefusion.globals.output_image_quality = output_image_quality
def update_output_image_resolution(output_image_resolution : str) -> None:
facefusion.globals.output_image_resolution = output_image_resolution
def update_output_video_encoder(output_video_encoder : OutputVideoEncoder) -> None:
    facefusion.globals.output_video_encoder = output_video_encoder


@ -2,10 +2,11 @@ from typing import Any, Dict, List, Optional
from time import sleep
import cv2
import gradio
import numpy

import facefusion.globals
from facefusion import wording, logger
from facefusion.audio import get_audio_frame, create_empty_audio_frame
from facefusion.common_helper import get_first
from facefusion.core import conditional_append_reference_faces
from facefusion.face_analyser import get_average_face, clear_face_analyser
@ -46,6 +47,8 @@ def render() -> None:
    source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
    if source_audio_path and facefusion.globals.output_video_fps:
        source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, facefusion.globals.reference_frame_number)
        if not numpy.any(source_audio_frame):
            source_audio_frame = create_empty_audio_frame()
    else:
        source_audio_frame = None
    if is_image(facefusion.globals.target_path):
@ -97,6 +100,8 @@ def listen() -> None:
        'face_debugger_items_checkbox_group',
        'face_enhancer_blend_slider',
        'frame_enhancer_blend_slider',
        'trim_frame_start_slider',
        'trim_frame_end_slider',
        'face_selector_mode_dropdown',
        'reference_face_distance_slider',
        'face_mask_types_checkbox_group',
@ -124,7 +129,8 @@ def listen() -> None:
        'lip_syncer_model_dropdown',
        'face_detector_model_dropdown',
        'face_detector_size_dropdown',
        'face_detector_score_slider',
        'face_landmarker_score_slider'
    ]
    for component_name in change_two_component_names:
        component = get_ui_component(component_name)
@ -153,10 +159,14 @@ def update_preview_image(frame_number : int = 0) -> gradio.Image:
    source_face = get_average_face(source_frames)
    source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
    if source_audio_path and facefusion.globals.output_video_fps:
        reference_audio_frame_number = facefusion.globals.reference_frame_number
        if facefusion.globals.trim_frame_start:
            reference_audio_frame_number -= facefusion.globals.trim_frame_start
        source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, reference_audio_frame_number)
        if not numpy.any(source_audio_frame):
            source_audio_frame = create_empty_audio_frame()
    else:
        source_audio_frame = None
    if is_image(facefusion.globals.target_path):
        target_vision_frame = read_static_image(facefusion.globals.target_path)
        preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, target_vision_frame)
@ -178,7 +188,7 @@ def update_preview_frame_slider() -> gradio.Slider:
def process_preview_frame(reference_faces : FaceSet, source_face : Face, source_audio_frame : AudioFrame, target_vision_frame : VisionFrame) -> VisionFrame:
    target_vision_frame = resize_frame_resolution(target_vision_frame, (640, 640))
    if analyse_frame(target_vision_frame):
        return cv2.GaussianBlur(target_vision_frame, (99, 99), 0)
    for frame_processor in facefusion.globals.frame_processors:


@ -1,4 +1,4 @@
from typing import Optional
import gradio

import facefusion.globals
@ -9,12 +9,10 @@ from facefusion.filesystem import is_video
from facefusion.uis.core import get_ui_component

TEMP_FRAME_FORMAT_DROPDOWN : Optional[gradio.Dropdown] = None

def render() -> None:
    global TEMP_FRAME_FORMAT_DROPDOWN

    TEMP_FRAME_FORMAT_DROPDOWN = gradio.Dropdown(
        label = wording.get('uis.temp_frame_format_dropdown'),
@ -22,34 +20,22 @@ def render() -> None:
        value = facefusion.globals.temp_frame_format,
        visible = is_video(facefusion.globals.target_path)
    )

def listen() -> None:
    TEMP_FRAME_FORMAT_DROPDOWN.change(update_temp_frame_format, inputs = TEMP_FRAME_FORMAT_DROPDOWN)
    target_video = get_ui_component('target_video')
    if target_video:
        for method in [ 'upload', 'change', 'clear' ]:
            getattr(target_video, method)(remote_update, outputs = TEMP_FRAME_FORMAT_DROPDOWN)

def remote_update() -> gradio.Dropdown:
    if is_video(facefusion.globals.target_path):
        return gradio.Dropdown(visible = True)
    return gradio.Dropdown(visible = False)

def update_temp_frame_format(temp_frame_format : TempFrameFormat) -> None:
    facefusion.globals.temp_frame_format = temp_frame_format


@ -5,7 +5,7 @@ import facefusion.globals
from facefusion import wording
from facefusion.vision import count_video_frame_total
from facefusion.filesystem import is_video
from facefusion.uis.core import get_ui_component, register_ui_component

TRIM_FRAME_START_SLIDER : Optional[gradio.Slider] = None
TRIM_FRAME_END_SLIDER : Optional[gradio.Slider] = None
@ -42,6 +42,8 @@ def render() -> None:
    with gradio.Row():
        TRIM_FRAME_START_SLIDER = gradio.Slider(**trim_frame_start_slider_args)
        TRIM_FRAME_END_SLIDER = gradio.Slider(**trim_frame_end_slider_args)
    register_ui_component('trim_frame_start_slider', TRIM_FRAME_START_SLIDER)
    register_ui_component('trim_frame_end_slider', TRIM_FRAME_END_SLIDER)

def listen() -> None:


@ -11,7 +11,9 @@ from tqdm import tqdm
import facefusion.globals
from facefusion import logger, wording
from facefusion.audio import create_empty_audio_frame
from facefusion.content_analyser import analyse_stream
from facefusion.filesystem import filter_image_paths
from facefusion.typing import VisionFrame, Face, Fps
from facefusion.face_analyser import get_average_face
from facefusion.processors.frame.core import get_frame_processors_modules, load_frame_processor_module
@ -92,9 +94,11 @@ def listen() -> None:
def start(webcam_mode : WebcamMode, webcam_resolution : str, webcam_fps : Fps) -> Generator[VisionFrame, None, None]:
    facefusion.globals.face_selector_mode = 'one'
    facefusion.globals.face_analyser_order = 'large-small'
    source_image_paths = filter_image_paths(facefusion.globals.source_paths)
    source_frames = read_static_images(source_image_paths)
    source_face = get_average_face(source_frames)
    stream = None

    if webcam_mode in [ 'udp', 'v4l2' ]:
        stream = open_stream(webcam_mode, webcam_resolution, webcam_fps) # type: ignore[arg-type]
    webcam_width, webcam_height = unpack_resolution(webcam_resolution)
@ -150,6 +154,7 @@ def stop() -> gradio.Image:
def process_stream_frame(source_face : Face, target_vision_frame : VisionFrame) -> VisionFrame:
    source_audio_frame = create_empty_audio_frame()
    for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors):
        logger.disable()
        if frame_processor_module.pre_process('stream'):
@ -157,8 +162,7 @@ def process_stream_frame(source_face : Face, target_vision_frame : VisionFrame)
            target_vision_frame = frame_processor_module.process_frame(
            {
                'source_face': source_face,
                'source_audio_frame': source_audio_frame,
                'target_vision_frame': target_vision_frame
            })
    return target_vision_frame


@ -58,10 +58,16 @@ def register_ui_component(name : ComponentName, component: Component) -> None:
def launch() -> None:
    ui_layouts_total = len(facefusion.globals.ui_layouts)
    with gradio.Blocks(theme = get_theme(), css = get_css(), title = metadata.get('name') + ' ' + metadata.get('version')) as ui:
        for ui_layout in facefusion.globals.ui_layouts:
            ui_layout_module = load_ui_layout_module(ui_layout)
            if ui_layout_module.pre_render():
                if ui_layouts_total > 1:
                    with gradio.Tab(ui_layout):
                        ui_layout_module.render()
                        ui_layout_module.listen()
                else:
                    ui_layout_module.render()
                    ui_layout_module.listen()
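With more than one layout configured, each one now renders inside its own tab; with a single layout the behaviour is unchanged. A hypothetical invocation that would trigger the tabbed rendering, assuming the existing ui_layouts option can be passed on the command line, might look like:

python run.py --ui-layouts default benchmark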


@ -75,4 +75,4 @@ def listen() -> None:
def run(ui : gradio.Blocks) -> None:
    ui.queue(concurrency_count = 4).launch(show_api = False, quiet = True)


@ -10,6 +10,8 @@ ComponentName = Literal\
    'target_image',
    'target_video',
    'preview_frame_slider',
    'trim_frame_start_slider',
    'trim_frame_end_slider',
    'face_selector_mode_dropdown',
    'reference_face_position_gallery',
    'reference_face_distance_slider',
@ -19,6 +21,7 @@ ComponentName = Literal\
    'face_detector_model_dropdown',
    'face_detector_size_dropdown',
    'face_detector_score_slider',
    'face_landmarker_score_slider',
    'face_mask_types_checkbox_group',
    'face_mask_blur_slider',
    'face_mask_padding_top_slider',


@ -1,12 +1,55 @@
from typing import Optional, List, Tuple
from functools import lru_cache
import cv2
import numpy
from cv2.typing import Size

from facefusion.typing import VisionFrame, Resolution, Fps
from facefusion.choices import image_template_sizes, video_template_sizes
from facefusion.filesystem import is_image, is_video
@lru_cache(maxsize = 128)
def read_static_image(image_path : str) -> Optional[VisionFrame]:
return read_image(image_path)
def read_static_images(image_paths : List[str]) -> Optional[List[VisionFrame]]:
frames = []
if image_paths:
for image_path in image_paths:
frames.append(read_static_image(image_path))
return frames
def read_image(image_path : str) -> Optional[VisionFrame]:
if is_image(image_path):
return cv2.imread(image_path)
return None
def write_image(image_path : str, vision_frame : VisionFrame) -> bool:
if image_path:
return cv2.imwrite(image_path, vision_frame)
return False
def detect_image_resolution(image_path : str) -> Optional[Resolution]:
if is_image(image_path):
image = read_image(image_path)
height, width = image.shape[:2]
return width, height
return None
def restrict_image_resolution(image_path : str, resolution : Resolution) -> Resolution:
if is_image(image_path):
image_resolution = detect_image_resolution(image_path)
if image_resolution < resolution:
return image_resolution
return resolution
def get_video_frame(video_path : str, frame_number : int = 0) -> Optional[VisionFrame]:
    if is_video(video_path):
        video_capture = cv2.VideoCapture(video_path)
@ -20,6 +63,21 @@ def get_video_frame(video_path : str, frame_number : int = 0) -> Optional[Vision
    return None
def create_image_resolutions(resolution : Resolution) -> List[str]:
resolutions = []
temp_resolutions = []
if resolution:
width, height = resolution
temp_resolutions.append(normalize_resolution(resolution))
for template_size in image_template_sizes:
temp_resolutions.append(normalize_resolution((width * template_size, height * template_size)))
temp_resolutions = sorted(set(temp_resolutions))
for temp_resolution in temp_resolutions:
resolutions.append(pack_resolution(temp_resolution))
return resolutions
def count_video_frame_total(video_path : str) -> int:
    if is_video(video_path):
        video_capture = cv2.VideoCapture(video_path)
@ -40,35 +98,49 @@ def detect_video_fps(video_path : str) -> Optional[float]:
    return None
def restrict_video_fps(video_path : str, fps : Fps) -> Fps:
    if is_video(video_path):
        video_fps = detect_video_fps(video_path)
        if video_fps < fps:
            return video_fps
    return fps

def detect_video_resolution(video_path : str) -> Optional[Resolution]:
    if is_video(video_path):
        video_capture = cv2.VideoCapture(video_path)
        if video_capture.isOpened():
            width = video_capture.get(cv2.CAP_PROP_FRAME_WIDTH)
            height = video_capture.get(cv2.CAP_PROP_FRAME_HEIGHT)
            video_capture.release()
            return int(width), int(height)
    return None

def restrict_video_resolution(video_path : str, resolution : Resolution) -> Resolution:
    if is_video(video_path):
        video_resolution = detect_video_resolution(video_path)
        if video_resolution < resolution:
            return video_resolution
    return resolution

def create_video_resolutions(resolution : Resolution) -> List[str]:
    resolutions = []
    temp_resolutions = []

    if resolution:
        width, height = resolution
        temp_resolutions.append(normalize_resolution(resolution))
        for template_size in video_template_sizes:
            if width > height:
                temp_resolutions.append(normalize_resolution((template_size * width / height, template_size)))
            else:
                temp_resolutions.append(normalize_resolution((template_size, template_size * height / width)))
        temp_resolutions = sorted(set(temp_resolutions))
        for temp_resolution in temp_resolutions:
            resolutions.append(pack_resolution(temp_resolution))
    return resolutions
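A quick sketch of how the UI composes these helpers; the example path and the resulting list entries are illustrative and depend on the configured video_template_sizes:

output_video_resolution = detect_video_resolution('.assets/examples/target-1080p.mp4')  # e.g. (1920, 1080)
output_video_resolutions = create_video_resolutions(output_video_resolution)            # e.g. [ '426x240', ..., '1920x1080', ..., '3840x2160' ]
facefusion.globals.output_video_resolution = pack_resolution(output_video_resolution)   # '1920x1080'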
def normalize_resolution(resolution : Tuple[float, float]) -> Resolution:
@@ -81,7 +153,7 @@ def normalize_resolution(resolution : Tuple[float, float]) -> Resolution:
 	return 0, 0


-def pack_resolution(resolution : Tuple[float, float]) -> str:
+def pack_resolution(resolution : Resolution) -> str:
 	width, height = normalize_resolution(resolution)
 	return str(width) + 'x' + str(height)
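The hunk above only shows the tail of normalize_resolution(), which pack_resolution() relies on. One possible body that is consistent with the tests further down, rounding each dimension to an even integer as video encoders typically require, would look like this; treat it as a sketch rather than the verbatim source:

	def normalize_resolution(resolution : Tuple[float, float]) -> Resolution:
		width, height = resolution

		if width and height:
			# rounding half the value and doubling it snaps the dimension to an even number
			normalize_width = round(width / 2) * 2
			normalize_height = round(height / 2) * 2
			return normalize_width, normalize_height
		return 0, 0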
@@ -91,8 +163,9 @@ def unpack_resolution(resolution : str) -> Resolution:
 	return width, height


-def resize_frame_resolution(vision_frame : VisionFrame, max_width : int, max_height : int) -> VisionFrame:
+def resize_frame_resolution(vision_frame : VisionFrame, max_resolution : Resolution) -> VisionFrame:
 	height, width = vision_frame.shape[:2]
+	max_width, max_height = max_resolution

 	if height > max_height or width > max_width:
 		scale = min(max_height / height, max_width / width)
@@ -106,26 +179,40 @@ def normalize_frame_color(vision_frame : VisionFrame) -> VisionFrame:
 	return cv2.cvtColor(vision_frame, cv2.COLOR_BGR2RGB)


-@lru_cache(maxsize = 128)
-def read_static_image(image_path : str) -> Optional[VisionFrame]:
-	return read_image(image_path)
-
-
-def read_static_images(image_paths : List[str]) -> Optional[List[VisionFrame]]:
-	frames = []
-	if image_paths:
-		for image_path in image_paths:
-			frames.append(read_static_image(image_path))
-	return frames
-
-
-def read_image(image_path : str) -> Optional[VisionFrame]:
-	if is_image(image_path):
-		return cv2.imread(image_path)
-	return None
-
-
-def write_image(image_path : str, frame : VisionFrame) -> bool:
-	if image_path:
-		return cv2.imwrite(image_path, frame)
-	return False
+def create_tile_frames(vision_frame : VisionFrame, size : Size) -> Tuple[List[VisionFrame], int, int]:
+	vision_frame = numpy.pad(vision_frame, ((size[1], size[1]), (size[1], size[1]), (0, 0)))
+	tile_width = size[0] - 2 * size[2]
+	pad_size_bottom = size[2] + tile_width - vision_frame.shape[0] % tile_width
+	pad_size_right = size[2] + tile_width - vision_frame.shape[1] % tile_width
+	pad_vision_frame = numpy.pad(vision_frame, ((size[2], pad_size_bottom), (size[2], pad_size_right), (0, 0)))
+	pad_height, pad_width = pad_vision_frame.shape[:2]
+	row_range = range(size[2], pad_height - size[2], tile_width)
+	col_range = range(size[2], pad_width - size[2], tile_width)
+	tile_vision_frames = []
+
+	for row_vision_frame in row_range:
+		top = row_vision_frame - size[2]
+		bottom = row_vision_frame + size[2] + tile_width
+		for column_vision_frame in col_range:
+			left = column_vision_frame - size[2]
+			right = column_vision_frame + size[2] + tile_width
+			tile_vision_frames.append(pad_vision_frame[top:bottom, left:right, :])
+	return tile_vision_frames, pad_width, pad_height
+
+
+def merge_tile_frames(tile_vision_frames : List[VisionFrame], temp_width : int, temp_height : int, pad_width : int, pad_height : int, size : Size) -> VisionFrame:
+	merge_vision_frame = numpy.zeros((pad_height, pad_width, 3)).astype(numpy.uint8)
+	tile_width = tile_vision_frames[0].shape[1] - 2 * size[2]
+	tiles_per_row = min(pad_width // tile_width, len(tile_vision_frames))
+
+	for index, tile_vision_frame in enumerate(tile_vision_frames):
+		tile_vision_frame = tile_vision_frame[size[2]:-size[2], size[2]:-size[2]]
+		row_index = index // tiles_per_row
+		col_index = index % tiles_per_row
+		top = row_index * tile_vision_frame.shape[0]
+		bottom = top + tile_vision_frame.shape[0]
+		left = col_index * tile_vision_frame.shape[1]
+		right = left + tile_vision_frame.shape[1]
+		merge_vision_frame[top:bottom, left:right, :] = tile_vision_frame
+	merge_vision_frame = merge_vision_frame[size[1]:size[1] + temp_height, size[1]:size[1] + temp_width, :]
+	return merge_vision_frame
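As read from the code above, create_tile_frames() splits a padded frame into overlapping tiles (size[0] is the tile size, size[1] the outer padding, size[2] the per-tile overlap), and merge_tile_frames() trims the overlaps and stitches the tiles back together; this is what lets the frame enhancer process large frames tile by tile. A round-trip sketch, with the size values picked purely for illustration:

	import numpy
	from facefusion.vision import create_tile_frames, merge_tile_frames

	vision_frame = numpy.random.randint(0, 255, (226, 426, 3)).astype(numpy.uint8)
	size = (128, 8, 2) # illustration values: tile size, outer padding, overlap
	temp_height, temp_width = vision_frame.shape[:2]
	tile_vision_frames, pad_width, pad_height = create_tile_frames(vision_frame, size)
	# each tile could be enhanced independently here before merging
	merge_vision_frame = merge_tile_frames(tile_vision_frames, temp_width, temp_height, pad_width, pad_height, size)
	assert merge_vision_frame.shape == vision_frame.shape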

View File

@@ -5,19 +5,27 @@ WORDING : Dict[str, Any] =\
 	'python_not_supported': 'Python version is not supported, upgrade to {version} or higher',
 	'ffmpeg_not_installed': 'FFMpeg is not installed',
 	'creating_temp': 'Creating temporary resources',
-	'extracting_frames_fps': 'Extracting frames with {video_fps} FPS',
+	'extracting_frames': 'Extracting frames with a resolution of {resolution} and {fps} frames per second',
+	'extracting_frames_succeed': 'Extracting frames succeed',
+	'extracting_frames_failed': 'Extracting frames failed',
 	'analysing': 'Analysing',
 	'processing': 'Processing',
 	'downloading': 'Downloading',
 	'temp_frames_not_found': 'Temporary frames not found',
-	'compressing_image_succeed': 'Compressing image succeed',
-	'compressing_image_skipped': 'Compressing image skipped',
-	'merging_video_fps': 'Merging video with {video_fps} FPS',
+	'copying_image': 'Copying image with a resolution of {resolution}',
+	'copying_image_succeed': 'Copying image succeed',
+	'copying_image_failed': 'Copying image failed',
+	'finalizing_image': 'Finalizing image with a resolution of {resolution}',
+	'finalizing_image_succeed': 'Finalizing image succeed',
+	'finalizing_image_skipped': 'Finalizing image skipped',
+	'merging_video': 'Merging video with a resolution of {resolution} and {fps} frames per second',
+	'merging_video_succeed': 'Merging video succeed',
 	'merging_video_failed': 'Merging video failed',
 	'skipping_audio': 'Skipping audio',
 	'restoring_audio_succeed': 'Restoring audio succeed',
 	'restoring_audio_skipped': 'Restoring audio skipped',
 	'clearing_temp': 'Clearing temporary resources',
+	'processing_stopped': 'Processing stopped',
 	'processing_image_succeed': 'Processing to image succeed in {seconds} seconds',
 	'processing_image_failed': 'Processing to image failed',
 	'processing_video_succeed': 'Processing to video succeed in {seconds} seconds',
@@ -67,8 +75,9 @@ WORDING : Dict[str, Any] =\
 	'face_detector_model': 'choose the model responsible for detecting the face',
 	'face_detector_size': 'specify the size of the frame provided to the face detector',
 	'face_detector_score': 'filter the detected faces base on the confidence score',
+	'face_landmarker_score': 'filter the detected landmarks base on the confidence score',
 	# face selector
-	'face_selector_mode': 'use reference based tracking with simple matching',
+	'face_selector_mode': 'use reference based tracking or simple matching',
 	'reference_face_position': 'specify the position used to create the reference face',
 	'reference_face_distance': 'specify the desired similarity between the reference face and target face',
 	'reference_frame_number': 'specify the frame used to create the reference face',
@@ -81,10 +90,10 @@ WORDING : Dict[str, Any] =\
 	'trim_frame_start': 'specify the the start frame of the target video',
 	'trim_frame_end': 'specify the the end frame of the target video',
 	'temp_frame_format': 'specify the temporary resources format',
-	'temp_frame_quality': 'specify the temporary resources quality',
 	'keep_temp': 'keep the temporary resources after processing',
 	# output creation
 	'output_image_quality': 'specify the image quality which translates to the compression factor',
+	'output_image_resolution': 'specify the image output resolution based on the target image',
 	'output_video_encoder': 'specify the encoder use for the video compression',
 	'output_video_preset': 'balance fast video processing and video file size',
 	'output_video_quality': 'specify the video quality which translates to the compression factor',
@@ -131,6 +140,7 @@ WORDING : Dict[str, Any] =\
 	'face_detector_model_dropdown': 'FACE DETECTOR MODEL',
 	'face_detector_size_dropdown': 'FACE DETECTOR SIZE',
 	'face_detector_score_slider': 'FACE DETECTOR SCORE',
+	'face_landmarker_score_slider': 'FACE LANDMARKER SCORE',
 	# face masker
 	'face_mask_types_checkbox_group': 'FACE MASK TYPES',
 	'face_mask_blur_slider': 'FACE MASK BLUR',
@@ -161,6 +171,7 @@ WORDING : Dict[str, Any] =\
 	# output options
 	'output_path_textbox': 'OUTPUT PATH',
 	'output_image_quality_slider': 'OUTPUT IMAGE QUALITY',
+	'output_image_resolution_dropdown': 'OUTPUT IMAGE RESOLUTION',
 	'output_video_encoder_dropdown': 'OUTPUT VIDEO ENCODER',
 	'output_video_preset_dropdown': 'OUTPUT VIDEO PRESET',
 	'output_video_quality_slider': 'OUTPUT VIDEO QUALITY',
@@ -175,7 +186,6 @@ WORDING : Dict[str, Any] =\
 	'target_file': 'TARGET',
 	# temp frame
 	'temp_frame_format_dropdown': 'TEMP FRAME FORMAT',
-	'temp_frame_quality_slider': 'TEMP FRAME QUALITY',
 	# trim frame
 	'trim_frame_start_slider': 'TRIM FRAME START',
 	'trim_frame_end_slider': 'TRIM FRAME END',

View File

@@ -1,11 +1,9 @@
-basicsr==1.4.2
 filetype==1.2.0
 gradio==3.50.2
-numpy==1.26.2
+numpy==1.26.4
 onnx==1.15.0
 onnxruntime==1.16.3
 opencv-python==4.8.1.78
-psutil==5.9.6
-realesrgan==0.3.0
-torch==2.1.2
-tqdm==4.66.1
+psutil==5.9.8
+tqdm==4.66.2
+scipy==1.12.0

View File

@@ -1,4 +1,4 @@
-from facefusion.common_helper import create_metavar, create_int_range, create_float_range
+from facefusion.common_helper import create_metavar, create_int_range, create_float_range, extract_major_version


 def test_create_metavar() -> None:
@@ -13,3 +13,9 @@ def test_create_int_range() -> None:
 def test_create_float_range() -> None:
 	assert create_float_range(0.0, 1.0, 0.5) == [ 0.0, 0.5, 1.0 ]
 	assert create_float_range(0.0, 0.2, 0.05) == [ 0.0, 0.05, 0.10, 0.15, 0.20 ]
+
+
+def test_extract_major_version() -> None:
+	assert extract_major_version('1') == (1, 0)
+	assert extract_major_version('1.1') == (1, 1)
+	assert extract_major_version('1.2.0') == (1, 2)
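The new test pins down the contract of extract_major_version(): it returns a (major, minor) tuple and pads a missing minor component with 0. One possible implementation that satisfies these assertions (a sketch, not necessarily the shipped code):

	from typing import Tuple

	def extract_major_version(version : str) -> Tuple[int, int]:
		versions = version.split('.')

		# anything beyond major and minor (e.g. the patch level) is ignored
		if len(versions) > 1:
			return int(versions[0]), int(versions[1])
		return int(versions[0]), 0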

View File

@@ -1,4 +1,4 @@
-from facefusion.execution_helper import encode_execution_providers, decode_execution_providers, apply_execution_provider_options, map_torch_backend
+from facefusion.execution import encode_execution_providers, decode_execution_providers, apply_execution_provider_options


 def test_encode_execution_providers() -> None:
@@ -19,8 +19,3 @@ def test_multiple_execution_providers() -> None:
 		})
 	]
 	assert apply_execution_provider_options([ 'CPUExecutionProvider', 'CUDAExecutionProvider' ]) == execution_provider_with_options
-
-
-def test_map_device() -> None:
-	assert map_torch_backend([ 'CPUExecutionProvider' ]) == 'cpu'
-	assert map_torch_backend([ 'CPUExecutionProvider', 'CUDAExecutionProvider' ]) == 'cuda'

View File

@@ -21,14 +21,33 @@ def before_all() -> None:
 @pytest.fixture(autouse = True)
 def before_each() -> None:
+	facefusion.globals.face_detector_score = 0.5
+	facefusion.globals.face_landmarker_score = 0.5
+	facefusion.globals.face_recognizer_model = 'arcface_inswapper'
 	clear_face_analyser()


 def test_get_one_face_with_retinaface() -> None:
 	facefusion.globals.face_detector_model = 'retinaface'
 	facefusion.globals.face_detector_size = '320x320'
-	facefusion.globals.face_detector_score = 0.5
-	facefusion.globals.face_recognizer_model = 'arcface_inswapper'
+
+	source_paths =\
+	[
+		'.assets/examples/source.jpg',
+		'.assets/examples/source-80crop.jpg',
+		'.assets/examples/source-70crop.jpg',
+		'.assets/examples/source-60crop.jpg'
+	]
+	for source_path in source_paths:
+		source_frame = read_static_image(source_path)
+		face = get_one_face(source_frame)
+
+		assert isinstance(face, Face)
+
+
+def test_get_one_face_with_scrfd() -> None:
+	facefusion.globals.face_detector_model = 'scrfd'
+	facefusion.globals.face_detector_size = '640x640'

 	source_paths =\
 	[
@@ -47,8 +66,6 @@ def test_get_one_face_with_retinaface() -> None:
 def test_get_one_face_with_yoloface() -> None:
 	facefusion.globals.face_detector_model = 'yoloface'
 	facefusion.globals.face_detector_size = '640x640'
-	facefusion.globals.face_detector_score = 0.5
-	facefusion.globals.face_recognizer_model = 'arcface_inswapper'

 	source_paths =\
 	[
@@ -67,8 +84,6 @@ def test_get_one_face_with_yoloface() -> None:
 def test_get_one_face_with_yunet() -> None:
 	facefusion.globals.face_detector_model = 'yunet'
 	facefusion.globals.face_detector_size = '640x640'
-	facefusion.globals.face_detector_score = 0.5
-	facefusion.globals.face_recognizer_model = 'arcface_inswapper'

 	source_paths =\
 	[

View File

@@ -3,6 +3,7 @@ import subprocess
 import pytest

 import facefusion.globals
+from facefusion import process_manager
 from facefusion.filesystem import get_temp_directory_path, create_temp, clear_temp
 from facefusion.download import conditional_download
 from facefusion.ffmpeg import extract_frames, read_audio_buffer
@@ -10,6 +11,7 @@ from facefusion.ffmpeg import extract_frames, read_audio_buffer
 @pytest.fixture(scope = 'module', autouse = True)
 def before_all() -> None:
+	process_manager.start()
 	conditional_download('.assets/examples',
 	[
 		'https://github.com/facefusion/facefusion-assets/releases/download/examples/source.jpg',
@@ -26,7 +28,6 @@ def before_all() -> None:
 def before_each() -> None:
 	facefusion.globals.trim_frame_start = None
 	facefusion.globals.trim_frame_end = None
-	facefusion.globals.temp_frame_quality = 80
 	facefusion.globals.temp_frame_format = 'jpg'
@@ -37,6 +38,7 @@ def test_extract_frames() -> None:
 		'.assets/examples/target-240p-30fps.mp4',
 		'.assets/examples/target-240p-60fps.mp4'
 	]
+
 	for target_path in target_paths:
 		temp_directory_path = get_temp_directory_path(target_path)
 		create_temp(target_path)
@@ -55,6 +57,7 @@ def test_extract_frames_with_trim_start() -> None:
 		('.assets/examples/target-240p-30fps.mp4', 100),
 		('.assets/examples/target-240p-60fps.mp4', 212)
 	]
+
 	for target_path, frame_total in data_provider:
 		temp_directory_path = get_temp_directory_path(target_path)
 		create_temp(target_path)
@@ -74,6 +77,7 @@ def test_extract_frames_with_trim_start_and_trim_end() -> None:
 		('.assets/examples/target-240p-30fps.mp4', 100),
 		('.assets/examples/target-240p-60fps.mp4', 50)
 	]
+
 	for target_path, frame_total in data_provider:
 		temp_directory_path = get_temp_directory_path(target_path)
 		create_temp(target_path)
@@ -92,6 +96,7 @@ def test_extract_frames_with_trim_end() -> None:
 		('.assets/examples/target-240p-30fps.mp4', 100),
 		('.assets/examples/target-240p-60fps.mp4', 50)
 	]
+
 	for target_path, frame_total in data_provider:
 		temp_directory_path = get_temp_directory_path(target_path)
 		create_temp(target_path)

View File

@@ -4,17 +4,16 @@ from facefusion.normalizer import normalize_output_path, normalize_padding, norm

 def test_normalize_output_path() -> None:
-	if platform.system().lower() != 'windows':
-		assert normalize_output_path([ '.assets/examples/source.jpg' ], None, '.assets/examples/target-240p.mp4') == '.assets/examples/target-240p.mp4'
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', '.assets/examples/target-240p.mp4') == '.assets/examples/target-240p.mp4'
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', '.assets/examples') == '.assets/examples/target-240p.mp4'
-		assert normalize_output_path([ '.assets/examples/source.jpg' ], '.assets/examples/target-240p.mp4', '.assets/examples') == '.assets/examples/source-target-240p.mp4'
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', '.assets/examples/output.mp4') == '.assets/examples/output.mp4'
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', '.assets/output.mov') == '.assets/output.mp4'
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', '.assets/examples/invalid') is None
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', '.assets/invalid/output.mp4') is None
-		assert normalize_output_path(None, '.assets/examples/target-240p.mp4', 'invalid') is None
-		assert normalize_output_path([ '.assets/examples/source.jpg' ], '.assets/examples/target-240p.mp4', None) is None
+	if platform.system().lower() == 'linux' or platform.system().lower() == 'darwin':
+		assert normalize_output_path('.assets/examples/target-240p.mp4', '.assets/examples/target-240p.mp4') == '.assets/examples/target-240p.mp4'
+		assert normalize_output_path('.assets/examples/target-240p.mp4', '.assets/examples').startswith('.assets/examples/target-240p')
+		assert normalize_output_path('.assets/examples/target-240p.mp4', '.assets/examples').endswith('.mp4')
+		assert normalize_output_path('.assets/examples/target-240p.mp4', '.assets/examples/output.mp4') == '.assets/examples/output.mp4'
+		assert normalize_output_path('.assets/examples/target-240p.mp4', '.assets/examples/invalid') is None
+		assert normalize_output_path('.assets/examples/target-240p.mp4', '.assets/invalid/output.mp4') is None
+		assert normalize_output_path('.assets/examples/target-240p.mp4', 'invalid') is None
+		assert normalize_output_path('.assets/examples/target-240p.mp4', None) is None
+		assert normalize_output_path(None, '.assets/examples/output.mp4') is None


 def test_normalize_padding() -> None:
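These assertions document the slimmed-down normalize_output_path(target_path, output_path) signature: a directory output keeps the target's file name and extension, an explicit file path is used as given, and anything that does not resolve to an existing directory yields None. A simplified sketch that satisfies the assertions above (the shipped implementation may differ, for example by appending a suffix to the derived file name):

	import os
	from typing import Optional

	def normalize_output_path(target_path : Optional[str], output_path : Optional[str]) -> Optional[str]:
		if target_path and output_path:
			target_name, target_extension = os.path.splitext(os.path.basename(target_path))
			# a directory output inherits the target's file name and extension
			if os.path.isdir(output_path):
				return os.path.join(output_path, target_name + target_extension)
			# otherwise the output must point into an existing directory and carry an extension
			output_name, output_extension = os.path.splitext(os.path.basename(output_path))
			output_directory_path = os.path.dirname(output_path)
			if os.path.isdir(output_directory_path) and output_extension:
				return os.path.join(output_directory_path, output_name + output_extension)
		return None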

View File

@@ -0,0 +1,22 @@
+from facefusion.process_manager import set_process_state, is_processing, is_stopping, is_pending, start, stop, end
+
+
+def test_start() -> None:
+	set_process_state('pending')
+	start()
+
+	assert is_processing()
+
+
+def test_stop() -> None:
+	set_process_state('processing')
+	stop()
+
+	assert is_stopping()
+
+
+def test_end() -> None:
+	set_process_state('processing')
+	end()
+
+	assert is_pending()
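This new test exercises the process manager, a small state machine that the UI and ffmpeg code consult to start and stop processing. A sketch of a module that would satisfy these tests, with the state names taken from the calls above (the actual module may hold additional states or hooks):

	from typing import Literal

	ProcessState = Literal[ 'pending', 'processing', 'stopping' ]

	PROCESS_STATE : ProcessState = 'pending'


	def get_process_state() -> ProcessState:
		return PROCESS_STATE


	def set_process_state(process_state : ProcessState) -> None:
		global PROCESS_STATE

		PROCESS_STATE = process_state


	def is_processing() -> bool:
		return get_process_state() == 'processing'


	def is_stopping() -> bool:
		return get_process_state() == 'stopping'


	def is_pending() -> bool:
		return get_process_state() == 'pending'


	def start() -> None:
		set_process_state('processing')


	def stop() -> None:
		set_process_state('stopping')


	def end() -> None:
		set_process_state('pending')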

View File

@@ -2,7 +2,7 @@ import subprocess
 import pytest

 from facefusion.download import conditional_download
-from facefusion.vision import get_video_frame, count_video_frame_total, detect_video_fps, detect_video_resolution, pack_resolution, unpack_resolution, create_video_resolutions
+from facefusion.vision import detect_image_resolution, restrict_image_resolution, create_image_resolutions, get_video_frame, count_video_frame_total, detect_video_fps, restrict_video_fps, detect_video_resolution, restrict_video_resolution, create_video_resolutions, normalize_resolution, pack_resolution, unpack_resolution


 @pytest.fixture(scope = 'module', autouse = True)
@@ -13,6 +13,10 @@ def before_all() -> None:
 		'https://github.com/facefusion/facefusion-assets/releases/download/examples/target-240p.mp4',
 		'https://github.com/facefusion/facefusion-assets/releases/download/examples/target-1080p.mp4'
 	])
+	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-240p.mp4', '-vframes', '1', '.assets/examples/target-240p.jpg' ])
+	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-1080p.mp4', '-vframes', '1', '.assets/examples/target-1080p.jpg' ])
+	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-240p.mp4', '-vframes', '1', '-vf', 'transpose=0', '.assets/examples/target-240p-90deg.jpg' ])
+	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-1080p.mp4', '-vframes', '1', '-vf', 'transpose=0', '.assets/examples/target-1080p-90deg.jpg' ])
 	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-240p.mp4', '-vf', 'fps=25', '.assets/examples/target-240p-25fps.mp4' ])
 	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-240p.mp4', '-vf', 'fps=30', '.assets/examples/target-240p-30fps.mp4' ])
 	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-240p.mp4', '-vf', 'fps=60', '.assets/examples/target-240p-60fps.mp4' ])
@@ -20,6 +24,28 @@ def before_all() -> None:
 	subprocess.run([ 'ffmpeg', '-i', '.assets/examples/target-1080p.mp4', '-vf', 'transpose=0', '.assets/examples/target-1080p-90deg.mp4' ])


+def test_detect_image_resolution() -> None:
+	assert detect_image_resolution('.assets/examples/target-240p.jpg') == (426, 226)
+	assert detect_image_resolution('.assets/examples/target-240p-90deg.jpg') == (226, 426)
+	assert detect_image_resolution('.assets/examples/target-1080p.jpg') == (2048, 1080)
+	assert detect_image_resolution('.assets/examples/target-1080p-90deg.jpg') == (1080, 2048)
+	assert detect_image_resolution('invalid') is None
+
+
+def test_restrict_image_resolution() -> None:
+	assert restrict_image_resolution('.assets/examples/target-1080p.jpg', (426, 226)) == (426, 226)
+	assert restrict_image_resolution('.assets/examples/target-1080p.jpg', (2048, 1080)) == (2048, 1080)
+	assert restrict_image_resolution('.assets/examples/target-1080p.jpg', (4096, 2160)) == (2048, 1080)
+
+
+def test_create_image_resolutions() -> None:
+	assert create_image_resolutions((426, 226)) == [ '106x56', '212x112', '320x170', '426x226', '640x340', '852x452', '1064x564', '1278x678', '1492x792', '1704x904' ]
+	assert create_image_resolutions((226, 426)) == [ '56x106', '112x212', '170x320', '226x426', '340x640', '452x852', '564x1064', '678x1278', '792x1492', '904x1704' ]
+	assert create_image_resolutions((2048, 1080)) == [ '512x270', '1024x540', '1536x810', '2048x1080', '3072x1620', '4096x2160', '5120x2700', '6144x3240', '7168x3780', '8192x4320' ]
+	assert create_image_resolutions((1080, 2048)) == [ '270x512', '540x1024', '810x1536', '1080x2048', '1620x3072', '2160x4096', '2700x5120', '3240x6144', '3780x7168', '4320x8192' ]
+	assert create_image_resolutions(None) == []
+
+
 def test_get_video_frame() -> None:
 	assert get_video_frame('.assets/examples/target-240p-25fps.mp4') is not None
 	assert get_video_frame('invalid') is None
@@ -39,25 +65,45 @@ def test_detect_video_fps() -> None:
 	assert detect_video_fps('invalid') is None


+def test_restrict_video_fps() -> None:
+	assert restrict_video_fps('.assets/examples/target-1080p.mp4', 20.0) == 20.0
+	assert restrict_video_fps('.assets/examples/target-1080p.mp4', 25.0) == 25.0
+	assert restrict_video_fps('.assets/examples/target-1080p.mp4', 60.0) == 25.0
+
+
 def test_detect_video_resolution() -> None:
-	assert detect_video_resolution('.assets/examples/target-240p.mp4') == (426.0, 226.0)
-	assert detect_video_resolution('.assets/examples/target-1080p.mp4') == (2048.0, 1080.0)
+	assert detect_video_resolution('.assets/examples/target-240p.mp4') == (426, 226)
+	assert detect_video_resolution('.assets/examples/target-240p-90deg.mp4') == (226, 426)
+	assert detect_video_resolution('.assets/examples/target-1080p.mp4') == (2048, 1080)
+	assert detect_video_resolution('.assets/examples/target-1080p-90deg.mp4') == (1080, 2048)
 	assert detect_video_resolution('invalid') is None


+def test_restrict_video_resolution() -> None:
+	assert restrict_video_resolution('.assets/examples/target-1080p.mp4', (426, 226)) == (426, 226)
+	assert restrict_video_resolution('.assets/examples/target-1080p.mp4', (2048, 1080)) == (2048, 1080)
+	assert restrict_video_resolution('.assets/examples/target-1080p.mp4', (4096, 2160)) == (2048, 1080)
+
+
+def test_create_video_resolutions() -> None:
+	assert create_video_resolutions((426, 226)) == [ '426x226', '452x240', '678x360', '904x480', '1018x540', '1358x720', '2036x1080', '2714x1440', '4072x2160', '8144x4320' ]
+	assert create_video_resolutions((226, 426)) == [ '226x426', '240x452', '360x678', '480x904', '540x1018', '720x1358', '1080x2036', '1440x2714', '2160x4072', '4320x8144' ]
+	assert create_video_resolutions((2048, 1080)) == [ '456x240', '682x360', '910x480', '1024x540', '1366x720', '2048x1080', '2730x1440', '4096x2160', '8192x4320' ]
+	assert create_video_resolutions((1080, 2048)) == [ '240x456', '360x682', '480x910', '540x1024', '720x1366', '1080x2048', '1440x2730', '2160x4096', '4320x8192' ]
+	assert create_video_resolutions(None) == []
+
+
+def test_normalize_resolution() -> None:
+	assert normalize_resolution((2.5, 2.5)) == (2, 2)
+	assert normalize_resolution((3.0, 3.0)) == (4, 4)
+	assert normalize_resolution((6.5, 6.5)) == (6, 6)
+
+
 def test_pack_resolution() -> None:
-	assert pack_resolution((1.0, 1.0)) == '0x0'
-	assert pack_resolution((2.0, 2.0)) == '2x2'
+	assert pack_resolution((1, 1)) == '0x0'
+	assert pack_resolution((2, 2)) == '2x2'


 def test_unpack_resolution() -> None:
 	assert unpack_resolution('0x0') == (0, 0)
 	assert unpack_resolution('2x2') == (2, 2)
-
-
-def test_create_video_resolutions() -> None:
-	assert create_video_resolutions('.assets/examples/target-240p.mp4') == [ '426x226', '452x240', '678x360', '904x480', '1018x540', '1358x720', '2036x1080', '2714x1440', '4072x2160' ]
-	assert create_video_resolutions('.assets/examples/target-240p-90deg.mp4') == [ '226x426', '240x452', '360x678', '480x904', '540x1018', '720x1358', '1080x2036', '1440x2714', '2160x4072' ]
-	assert create_video_resolutions('.assets/examples/target-1080p.mp4') == [ '456x240', '682x360', '910x480', '1024x540', '1366x720', '2048x1080', '2730x1440', '4096x2160' ]
-	assert create_video_resolutions('.assets/examples/target-1080p-90deg.mp4') == [ '240x456', '360x682', '480x910', '540x1024', '720x1366', '1080x2048', '1440x2730', '2160x4096' ]
-	assert create_video_resolutions('invalid') is None