facefusion/facefusion/uis/components/preview.py

from typing import Any, Dict, List, Optional
from time import sleep

import cv2
import gradio

import facefusion.globals
from facefusion import wording, logger
from facefusion.audio import get_audio_frame
from facefusion.common_helper import get_first
from facefusion.core import conditional_append_reference_faces
from facefusion.face_analyser import get_average_face, clear_face_analyser
from facefusion.face_store import clear_static_faces, get_reference_faces, clear_reference_faces
from facefusion.typing import Face, FaceSet, AudioFrame, VisionFrame
from facefusion.vision import get_video_frame, count_video_frame_total, normalize_frame_color, resize_frame_resolution, read_static_image, read_static_images
from facefusion.filesystem import is_image, is_video, filter_audio_paths
from facefusion.content_analyser import analyse_frame
from facefusion.processors.frame.core import load_frame_processor_module
from facefusion.uis.typing import ComponentName
from facefusion.uis.core import get_ui_component, register_ui_component

PREVIEW_IMAGE : Optional[gradio.Image] = None
PREVIEW_FRAME_SLIDER : Optional[gradio.Slider] = None
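

# Render the preview widgets and build an initial preview frame from the
# current globals: an averaged source face, an optional source audio frame
# and the target image or video. The frame slider only becomes visible for
# video targets.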
def render() -> None:
    global PREVIEW_IMAGE
    global PREVIEW_FRAME_SLIDER

    preview_image_args : Dict[str, Any] =\
    {
        'label': wording.get('uis.preview_image'),
        'interactive': False
    }
    preview_frame_slider_args : Dict[str, Any] =\
    {
        'label': wording.get('uis.preview_frame_slider'),
        'step': 1,
        'minimum': 0,
        'maximum': 100,
        'visible': False
    }
    conditional_append_reference_faces()
    reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
    source_frames = read_static_images(facefusion.globals.source_paths)
    source_face = get_average_face(source_frames)
    source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
    if source_audio_path and facefusion.globals.output_video_fps:
        source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, facefusion.globals.reference_frame_number)
    else:
        source_audio_frame = None
    if is_image(facefusion.globals.target_path):
        target_vision_frame = read_static_image(facefusion.globals.target_path)
        preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, target_vision_frame)
        preview_image_args['value'] = normalize_frame_color(preview_vision_frame)
    if is_video(facefusion.globals.target_path):
        temp_vision_frame = get_video_frame(facefusion.globals.target_path, facefusion.globals.reference_frame_number)
        preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, temp_vision_frame)
        preview_image_args['value'] = normalize_frame_color(preview_vision_frame)
        preview_image_args['visible'] = True
        preview_frame_slider_args['value'] = facefusion.globals.reference_frame_number
        preview_frame_slider_args['maximum'] = count_video_frame_total(facefusion.globals.target_path)
        preview_frame_slider_args['visible'] = True
    PREVIEW_IMAGE = gradio.Image(**preview_image_args)
    PREVIEW_FRAME_SLIDER = gradio.Slider(**preview_frame_slider_args)
    register_ui_component('preview_frame_slider', PREVIEW_FRAME_SLIDER)
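

# Wire the UI events: releasing the slider or selecting a reference face
# refreshes the preview; source/target uploads refresh the preview and
# resize the slider; option changes refresh the preview, and model related
# changes additionally clear the cached faces first.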
def listen() -> None:
    PREVIEW_FRAME_SLIDER.release(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
    reference_face_position_gallery = get_ui_component('reference_face_position_gallery')
    if reference_face_position_gallery:
        reference_face_position_gallery.select(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
    multi_one_component_names : List[ComponentName] =\
    [
        'source_audio',
        'source_image',
        'target_image',
        'target_video'
    ]
    for component_name in multi_one_component_names:
        component = get_ui_component(component_name)
        if component:
            for method in [ 'upload', 'change', 'clear' ]:
                getattr(component, method)(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
    multi_two_component_names : List[ComponentName] =\
    [
        'target_image',
        'target_video'
    ]
    for component_name in multi_two_component_names:
        component = get_ui_component(component_name)
        if component:
            for method in [ 'upload', 'change', 'clear' ]:
                getattr(component, method)(update_preview_frame_slider, outputs = PREVIEW_FRAME_SLIDER)
    change_one_component_names : List[ComponentName] =\
    [
        'face_debugger_items_checkbox_group',
        'face_enhancer_blend_slider',
        'frame_enhancer_blend_slider',
        'face_selector_mode_dropdown',
        'reference_face_distance_slider',
        'face_mask_types_checkbox_group',
        'face_mask_blur_slider',
        'face_mask_padding_top_slider',
        'face_mask_padding_bottom_slider',
        'face_mask_padding_left_slider',
        'face_mask_padding_right_slider',
        'face_mask_region_checkbox_group',
        'face_analyser_order_dropdown',
        'face_analyser_age_dropdown',
        'face_analyser_gender_dropdown',
        'output_video_fps_slider'
    ]
    for component_name in change_one_component_names:
        component = get_ui_component(component_name)
        if component:
            component.change(update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
    change_two_component_names : List[ComponentName] =\
    [
        'frame_processors_checkbox_group',
        'face_enhancer_model_dropdown',
        'face_swapper_model_dropdown',
        'frame_enhancer_model_dropdown',
        'lip_syncer_model_dropdown',
        'face_detector_model_dropdown',
        'face_detector_size_dropdown',
        'face_detector_score_slider'
    ]
    for component_name in change_two_component_names:
        component = get_ui_component(component_name)
        if component:
            component.change(clear_and_update_preview_image, inputs = PREVIEW_FRAME_SLIDER, outputs = PREVIEW_IMAGE)
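

# Clear the cached face analyser, reference faces and static faces before
# re-rendering, so that detector or model changes take effect in the preview.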
def clear_and_update_preview_image(frame_number : int = 0) -> gradio.Image:
    clear_face_analyser()
    clear_reference_faces()
    clear_static_faces()
    sleep(0.5)
    return update_preview_image(frame_number)
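

# Re-process a single frame for the preview. post_check() is polled for each
# frame processor until its model files are available, with logging silenced
# while waiting.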
def update_preview_image(frame_number : int = 0) -> gradio.Image:
    for frame_processor in facefusion.globals.frame_processors:
        frame_processor_module = load_frame_processor_module(frame_processor)
        while not frame_processor_module.post_check():
            logger.disable()
            sleep(0.5)
        logger.enable()
    conditional_append_reference_faces()
    reference_faces = get_reference_faces() if 'reference' in facefusion.globals.face_selector_mode else None
    source_frames = read_static_images(facefusion.globals.source_paths)
    source_face = get_average_face(source_frames)
    source_audio_path = get_first(filter_audio_paths(facefusion.globals.source_paths))
    if source_audio_path and facefusion.globals.output_video_fps:
        source_audio_frame = get_audio_frame(source_audio_path, facefusion.globals.output_video_fps, facefusion.globals.reference_frame_number)
    else:
        source_audio_frame = None
    if is_image(facefusion.globals.target_path):
        target_vision_frame = read_static_image(facefusion.globals.target_path)
        preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, target_vision_frame)
        preview_vision_frame = normalize_frame_color(preview_vision_frame)
        return gradio.Image(value = preview_vision_frame)
    if is_video(facefusion.globals.target_path):
        temp_vision_frame = get_video_frame(facefusion.globals.target_path, frame_number)
        preview_vision_frame = process_preview_frame(reference_faces, source_face, source_audio_frame, temp_vision_frame)
        preview_vision_frame = normalize_frame_color(preview_vision_frame)
        return gradio.Image(value = preview_vision_frame)
    return gradio.Image(value = None)
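

# Size the slider to the frame count of the target video and hide it whenever
# the target is not a video.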
def update_preview_frame_slider() -> gradio.Slider:
    if is_video(facefusion.globals.target_path):
        video_frame_total = count_video_frame_total(facefusion.globals.target_path)
        return gradio.Slider(maximum = video_frame_total, visible = True)
    return gradio.Slider(value = None, maximum = None, visible = False)
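

# Run every active frame processor over one frame, capped to 640x640 for
# responsiveness. Frames flagged by the content analyser are returned blurred
# instead of being processed.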
def process_preview_frame(reference_faces : FaceSet, source_face : Face, source_audio_frame : AudioFrame, target_vision_frame : VisionFrame) -> VisionFrame:
    target_vision_frame = resize_frame_resolution(target_vision_frame, 640, 640)
    if analyse_frame(target_vision_frame):
        return cv2.GaussianBlur(target_vision_frame, (99, 99), 0)
    for frame_processor in facefusion.globals.frame_processors:
        frame_processor_module = load_frame_processor_module(frame_processor)
        logger.disable()
        if frame_processor_module.pre_process('preview'):
            logger.enable()
            target_vision_frame = frame_processor_module.process_frame(
            {
                'reference_faces': reference_faces,
                'source_face': source_face,
                'source_audio_frame': source_audio_frame,
                'target_vision_frame': target_vision_frame
            })
    return target_vision_frame
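

# Usage sketch (not part of the original file): a layout module is assumed to
# mount this component inside a gradio.Blocks context, roughly like this:
#
#   import gradio
#   from facefusion.uis.components import preview
#
#   with gradio.Blocks():
#       preview.render()
#       preview.listen()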