facefusion/facefusion/uis/components/preview.py
Henry Ruhs 7609df6747
Next (#436)
* Rename landmark 5 variables

* Mark as NEXT

* Render tabs for multiple ui layout usage

* Allow many face detectors at once, Add face detector tweaks

* Remove face detector tweaks for now (kinda placebo)

* Fix lint issues

* Allow rendering the landmark-5 and landmark-5/68 via debugger

* Fix naming

* Convert face landmark based on confidence score

* Add scrfd face detector model (#397)

* Add scrfd face detector model

* Switch to scrfd_2.5g.onnx model

* Just some renaming

* Downgrade OpenCV, Add SYSTEM_VERSION_COMPAT=0 for macOS

* Improve naming

* Prepare detect frame outside of semaphore

* Feat/process manager (#399)

* Minor naming

* Introduce process manager to start and stop

* Remove useless test for now

* Avoid useless variables

* Show stop once is_processing is True

* Allow to stop ffmpeg processing too

* Implement output image resolution (#403)

* Implement output image resolution

* Reorder code

* Simplify output logic and therefore fix bug

* Frame-enhancer-onnx (#404)

* changes

* Add models

* Update workflow

* Some cleanup

* Feat/frame enhancer polishing (#410)

* Some cleanup

* Polish the frame enhancer

* Frame Enhancer: Add more models, optimize processing

* Minor changes

* Improve readability of create_tile_frames and merge_tile_frames

* We don't have enough models yet

* Feat/face landmarker score (#413)

* Introduce face landmarker score

* Fix testing

* Use release for score related sliders

* Reduce face landmark fallbacks

* Scores and landmarks in Face dict, Change color-theme in face debugger

* Fix some naming

* Add 8K support (for whatever reasons)

* Fix testing

* Use get() for face.landmarks

* Introduce statistics

* More statistics

* Limit the histogram equalization

* Enable queue() for default layout

* Improve copy_image()

* Fix error when switching detector model

* Always set UI values with globals if possible

* Use different logic for output image and output video resolutions

* Enforce re-download if file size is off

* Remove unused method

* Remove unused warning filter

* Improved output path normalization (#419)

* Handle some exceptions

* Cleanup

* Prevent countless thread locks

* Listen to user feedback

* Fix webp edge case

* Feat/cuda device detection (#424)

* Introduce cuda device detection

* it's gtx

* Move logic to run_nvidia_smi()

* Finalize execution device naming

* Merge execution_helper.py to execution.py

* Undo lowercase of values

* Finalize naming

* Add missing entry to ini

* Fix lip_syncer preview (#426)

* Fix lip_syncer preview

* change

* Refresh preview on trim changes

* Cleanup frame enhancers and remove useless scale in merge_video() (#428)

* Keep lips over the whole video once lip syncer is enabled (#430)

* Keep lips over the whole video once lip syncer is enabled

* changes

* Fix spacing

* Use empty audio frame on silence

* Fix ConfigParser encoding (#431)

facefusion.ini is UTF-8 encoded, but config.py doesn't specify an encoding, which results in corrupted entries when non-English characters are used.

Affected entries:
source_paths
target_path
output_path
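
A minimal sketch of the fix, assuming config.py reads the ini through the stock ConfigParser:

    from configparser import ConfigParser

    config = ConfigParser()
    # Without an explicit encoding, read() falls back to the locale's
    # preferred encoding, which corrupts non-English characters in the
    # entries listed above.
    config.read('facefusion.ini', encoding = 'utf-8')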

* Adjust spacing

* Improve the GTX 16 series detection

* Use general exception to catch ParseError

* Host frame enhancer models

* Use latest onnxruntime

* Minor changes in benchmark UI

* Different approach to cancel ffmpeg process

* Add support for AMD AMF encoders (#433)

* Add amd_amf encoders

* Remove -rc cqp from amf encoder parameters

* Improve terminal output, move success messages to debug mode

* Minor update

* onnxruntime 1.17.1 matches CUDA 12.2

* Feat/improved scaling (#435)

* Prevent useless temp upscaling, Show resolution and fps in terminal output

* Remove temp frame quality

* Remove temp frame quality

* Tiny cleanup

* Default back to png for temp frames, Remove pix_fmt from frame extraction due to mjpeg error

* Fix inswapper fallback by onnxruntime

* Fix inswapper fallback by major onnxruntime

* Add testing for vision restrict methods

* Fix left / right face mask regions, add left-ear and right-ear

* Flip right and left again

* Undo ears - does not work with box mask

* Prepare next release

* Fix spacing

* 100% quality when using jpg for temp frames

* Use span_kendata_x4 as default due to its speed

* Benchmark optimal tile and pad

* Undo commented out code

* Add real_esrgan_x4_fp16 model

* Be strict when using many face detectors

---------

Co-authored-by: Harisreedhar <46858047+harisreedhar@users.noreply.github.com>
Co-authored-by: aldemoth <159712934+aldemoth@users.noreply.github.com>
2024-03-14 19:56:54 +01:00

from typing import Any, Dict, List, Optional
from time import sleep
import cv2
import gradio
import numpy

import facefusion.globals
from facefusion import wording, logger
from facefusion.audio import get_audio_frame, create_empty_audio_frame
from facefusion.common_helper import get_first
from facefusion.core import conditional_append_reference_faces
from facefusion.face_analyser import get_average_face, clear_face_analyser
from facefusion.face_store import clear_static_faces, get_reference_faces, clear_reference_faces
from facefusion.typing import Face, FaceSet, AudioFrame, VisionFrame
from facefusion.vision import get_video_frame, count_video_frame_total, normalize_frame_color, resize_frame_resolution, read_static_image, read_static_images
from facefusion.filesystem import is_image, is_video, filter_audio_paths
from facefusion.content_analyser import analyse_frame
from facefusion.processors.frame.core import load_frame_processor_module
from facefusion.uis.typing import ComponentName
from facefusion.uis.core import get_ui_component, register_ui_component

PREVIEW_IMAGE : Optional[gradio.Image] = None
PREVIEW_FRAME_SLIDER : Optional[gradio.Slider] = None
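

# Create the preview image and frame slider, seeding their initial values
# from the current globals (target path, reference frame number).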
def render() -> None:
	global PREVIEW_IMAGE
	global PREVIEW_FRAME_SLIDER

	preview_image_args : Dict[str, Any] =\
	{
		'label': wording.get('uis.preview_image'),
		'interactive': False
	}
	preview_frame_slider_args : Dict[str, Any] =\
	{
		'label': wording.get('uis.preview_frame_slider'),
		'step': 1,
		'minimum': 0,
		'maximum': 100,
		'visible': False
	}
	conditional_append_reference_faces()
	reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
	source_frames = read_static_images(facefusion.globals.source_paths)
	source_face = get_average_face(source_frames)
	source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
	if source_audio_path and facefusion.globals.output_video_fps:
		source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, facefusion.globals.reference_frame_number)
		if not numpy.any(source_audio_frame):
			source_audio_frame = create_empty_audio_frame()
	else:
		source_audio_frame = None
	if is_image(facefusion.globals.target_path):
		target_vision_frame = read_static_image(facefusion.globals.target_path)
		preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, target_vision_frame)
		preview_image_args['value'] = normalize_frame_color(preview_vision_frame)
	if is_video(facefusion.globals.target_path):
		temp_vision_frame = get_video_frame(facefusion.globals.target_path, facefusion.globals.reference_frame_number)
		preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, temp_vision_frame)
		preview_image_args['value'] = normalize_frame_color(preview_vision_frame)
		preview_image_args['visible'] = True
		preview_frame_slider_args['value'] = facefusion.globals.reference_frame_number
		preview_frame_slider_args['maximum'] = count_video_frame_total(facefusion.globals.target_path)
		preview_frame_slider_args['visible'] = True
	PREVIEW_IMAGE = gradio.Image(**preview_image_args)
	PREVIEW_FRAME_SLIDER = gradio.Slider(**preview_frame_slider_args)
	register_ui_component('preview_frame_slider', PREVIEW_FRAME_SLIDER)
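

# Wire UI events to the preview: lightweight option changes re-render the
# current frame, while model and detector changes clear cached faces first.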
def listen() -> None:
	PREVIEW_FRAME_SLIDER.release(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
	reference_face_position_gallery = get_ui_component('reference_face_position_gallery')
	if reference_face_position_gallery:
		reference_face_position_gallery.select(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
	multi_one_component_names : List[ComponentName] =\
	[
		'source_audio',
		'source_image',
		'target_image',
		'target_video'
	]
	for component_name in multi_one_component_names:
		component = get_ui_component(component_name)
		if component:
			for method in [ 'upload', 'change', 'clear' ]:
				getattr(component, method)(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
	multi_two_component_names : List[ComponentName] =\
	[
		'target_image',
		'target_video'
	]
	for component_name in multi_two_component_names:
		component = get_ui_component(component_name)
		if component:
			for method in [ 'upload', 'change', 'clear' ]:
				getattr(component, method)(update_preview_frame_slider, outputs = PREVIEW_FRAME_SLIDER)
	change_one_component_names : List[ComponentName] =\
	[
		'face_debugger_items_checkbox_group',
		'face_enhancer_blend_slider',
		'frame_enhancer_blend_slider',
		'trim_frame_start_slider',
		'trim_frame_end_slider',
		'face_selector_mode_dropdown',
		'reference_face_distance_slider',
		'face_mask_types_checkbox_group',
		'face_mask_blur_slider',
		'face_mask_padding_top_slider',
		'face_mask_padding_bottom_slider',
		'face_mask_padding_left_slider',
		'face_mask_padding_right_slider',
		'face_mask_region_checkbox_group',
		'face_analyser_order_dropdown',
		'face_analyser_age_dropdown',
		'face_analyser_gender_dropdown',
		'output_video_fps_slider'
	]
	for component_name in change_one_component_names:
		component = get_ui_component(component_name)
		if component:
			component.change(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
	change_two_component_names : List[ComponentName] =\
	[
		'frame_processors_checkbox_group',
		'face_enhancer_model_dropdown',
		'face_swapper_model_dropdown',
		'frame_enhancer_model_dropdown',
		'lip_syncer_model_dropdown',
		'face_detector_model_dropdown',
		'face_detector_size_dropdown',
		'face_detector_score_slider',
		'face_landmarker_score_slider'
	]
	for component_name in change_two_component_names:
		component = get_ui_component(component_name)
		if component:
			component.change(clear_and_update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
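

# Model and detector changes invalidate cached analysis, so drop the cached
# faces before re-rendering the preview.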
def clear_and_update_preview_image(frame_number : int = 0) -> gradio.Image:
	clear_face_analyser()
	clear_reference_faces()
	clear_static_faces()
	sleep(0.5)
	return update_preview_image(frame_number)
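

# Re-render the preview for the given frame number, waiting until every
# selected frame processor passes its post_check() (typically model downloads).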
def update_preview_image(frame_number : int = 0) -> gradio.Image:
	for frame_processor in facefusion.globals.frame_processors:
		frame_processor_module = load_frame_processor_module(frame_processor)
		while not frame_processor_module.post_check():
			logger.disable()
			sleep(0.5)
	logger.enable()
	conditional_append_reference_faces()
	reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
	source_frames = read_static_images(facefusion.globals.source_paths)
	source_face = get_average_face(source_frames)
	source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
	if source_audio_path and facefusion.globals.output_video_fps:
		reference_audio_frame_number = facefusion.globals.reference_frame_number
		if facefusion.globals.trim_frame_start:
			reference_audio_frame_number -= facefusion.globals.trim_frame_start
		source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, reference_audio_frame_number)
		if not numpy.any(source_audio_frame):
			source_audio_frame = create_empty_audio_frame()
	else:
		source_audio_frame = None
	if is_image(facefusion.globals.target_path):
		target_vision_frame = read_static_image(facefusion.globals.target_path)
		preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, target_vision_frame)
		preview_vision_frame = normalize_frame_color(preview_vision_frame)
		return gradio.Image(value = preview_vision_frame)
	if is_video(facefusion.globals.target_path):
		temp_vision_frame = get_video_frame(facefusion.globals.target_path, frame_number)
		preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, temp_vision_frame)
		preview_vision_frame = normalize_frame_color(preview_vision_frame)
		return gradio.Image(value = preview_vision_frame)
	return gradio.Image(value = None)
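

# The slider only applies to videos: bind its maximum to the total frame
# count, otherwise hide it.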
def update_preview_frame_slider() -> gradio.Slider:
	if is_video(facefusion.globals.target_path):
		video_frame_total = count_video_frame_total(facefusion.globals.target_path)
		return gradio.Slider(maximum = video_frame_total, visible = True)
	return gradio.Slider(value = None, maximum = None, visible = False)
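

# Downscale the target frame to fit 640x640, blur it entirely when the
# content analyser flags it, then run each frame processor over it.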
def process_preview_frame(reference_faces : FaceSet, source_face : Face, source_audio_frame : AudioFrame, target_vision_frame : VisionFrame) -> VisionFrame:
	target_vision_frame = resize_frame_resolution(target_vision_frame, (640, 640))
	if analyse_frame(target_vision_frame):
		return cv2.GaussianBlur(target_vision_frame, (99, 99), 0)
	for frame_processor in facefusion.globals.frame_processors:
		frame_processor_module = load_frame_processor_module(frame_processor)
		logger.disable()
		if frame_processor_module.pre_process('preview'):
			logger.enable()
			target_vision_frame = frame_processor_module.process_frame(
			{
				'reference_faces': reference_faces,
				'source_face': source_face,
				'source_audio_frame': source_audio_frame,
				'target_vision_frame': target_vision_frame
			})
	return target_vision_frame