Welcome to the Ultimate Feature Launcher, a Python-based graphical user interface (GUI) application designed for launching powerful OCR and PDF-related tools with ease. Built using tkinter, this application offers a clean, user-friendly interface to help you efficiently execute tasks like advanced OCR, simple OCR, PDF-to-Audiobook conversion, and PDF translation.
- Enhanced OCR (Gemini AI): Leverage a robust OCR powered by Gemini AI for advanced and accurate text extraction.
- Simple OCR (EasyOCR): A lightweight and efficient OCR utility for quick and straightforward text recognition.
- PDF to Audiobook: Converts PDF documents into audiobooks, making content more accessible.
- PDF Translation: Translate PDF content into different languages for multilingual support.
- Dynamic UI: Hover effects and responsive UI to enhance the user experience.
- Fullscreen Mode: The app launches in fullscreen for better focus, but can easily exit to a windowed mode using the Esc key.
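Hover effects in tkinter are usually built by binding the `<Enter>` and `<Leave>` events on a widget. The sketch below shows one way this might look. The function name `add_hover_effect` and both colors are assumptions made for illustration and are not taken from the project's code.

```python
def add_hover_effect(widget, normal_bg="#2d2d2d", hover_bg="#4a90d9"):
    """Change a widget's background while the pointer is over it.

    The function name and default colors are hypothetical; the real
    application may style its buttons differently.
    """
    # <Enter> fires when the pointer moves onto the widget,
    # <Leave> fires when it moves off again.
    widget.bind("<Enter>", lambda event: widget.configure(bg=hover_bg))
    widget.bind("<Leave>", lambda event: widget.configure(bg=normal_bg))
    # Start in the normal state.
    widget.configure(bg=normal_bg)


def demo():
    """Open a small window with one hover-enabled button (needs a display)."""
    import tkinter as tk

    root = tk.Tk()
    button = tk.Button(root, text="Simple OCR (EasyOCR)")
    button.pack(padx=40, pady=40)
    add_hover_effect(button)
    root.mainloop()
```

Calling `demo()` opens the window. Some platforms draw buttons natively and may not honor the `bg` option, so the visible effect can vary.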
Ensure you have the following installed on your system:
- Python 3.7 or later
- Required Python libraries:
  - tkinter (for GUI creation)
  - subprocess (for running external scripts)
  - tkinter.messagebox (for popup notifications)
To check if Python is installed, run:
python --version
All three libraries are part of the Python standard library, so they normally need no separate pip install. Some Linux distributions package tkinter separately; on Debian or Ubuntu, for example, it is installed with:
sudo apt install python3-tk
Here’s the structure of the project files:
Ultimate-Feature-Launcher/
├── interface.py # Main application script
├── Gemini_OCR_SDK.py # Script for Enhanced OCR (ensure this file exists)
├── simpleocr.py # Script for Simple OCR
├── de.py # Script for PDF to Audiobook
├── pdftrans.py # Script for PDF Translation
└── README.md # Project documentation
Place the required scripts (Gemini_OCR_SDK.py, simpleocr.py, de.py, pdftrans.py) in the same directory as main.py for seamless integration.
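The prerequisites and file layout described above can be checked from Python before the first launch. The following is a minimal sketch rather than part of the project: the function names are hypothetical, and the script names are copied from the layout listed in this README.

```python
import importlib.util
import sys
from pathlib import Path

# Script names are taken from the project layout in this README.
REQUIRED_SCRIPTS = ["Gemini_OCR_SDK.py", "simpleocr.py", "de.py", "pdftrans.py"]
MIN_PYTHON = (3, 7)


def python_is_recent_enough(version_info=None):
    """Return True when the interpreter meets the minimum version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def tkinter_is_available():
    """Return True when both tkinter and its C extension can be located.

    find_spec only locates the modules; it does not import them, so this
    works on machines without a display.
    """
    return (importlib.util.find_spec("tkinter") is not None
            and importlib.util.find_spec("_tkinter") is not None)


def missing_scripts(directory):
    """Return the required scripts that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_SCRIPTS if not (base / name).is_file()]


def report(directory="."):
    """Print a short summary of what is present and what is missing."""
    print("Python >= 3.7:", python_is_recent_enough())
    print("tkinter found:", tkinter_is_available())
    print("Missing scripts:", missing_scripts(directory) or "none")
```

Running `report()` from the project directory prints one line per check.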
Follow these steps to set up and run the application:
- Clone the Repository: Clone this repository to your local machine:
  git clone https://github.com/SamirYMeshram/ocr-with-their-applications.git
  cd ocr-with-their-applications
- Add the Required Scripts: Ensure the following files are in the same directory:
  - Gemini_OCR_SDK.py
  - simpleocr.py
  - de.py
  - pdftrans.py
- Run the Application: Launch the GUI application by running the following command:
  python main.py
- Launch the application (main.py).
- Upon startup, the app will open in fullscreen mode.
- Choose one of the available features by clicking the corresponding button:
  - Enhanced OCR (Gemini AI): Launches the Gemini_OCR_SDK.py script.
  - Simple OCR (EasyOCR): Launches the simpleocr.py script.
  - PDF to Audiobook: Starts the de.py script for audiobook conversion.
  - PDF Translation: Executes the pdftrans.py script for translation.
- A popup will confirm the feature has started.
- Press the Esc key to exit fullscreen mode.
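Each button starts a separate script and then shows a confirmation popup. How the project wires this up is not shown in this README, so the following is only a sketch. It assumes each script is started as a child process with `subprocess.Popen`, and all three function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script_name, base_dir="."):
    """Build the command list that runs `script_name` with this interpreter."""
    script_path = Path(base_dir) / script_name
    # sys.executable keeps the child on the same Python as the launcher.
    return [sys.executable, str(script_path)]


def launch_script(script_name, base_dir="."):
    """Start the script without blocking and return the Popen handle."""
    return subprocess.Popen(build_command(script_name, base_dir))


def launch_with_popup(script_name, feature_label, base_dir="."):
    """Start a feature and confirm it with a popup (needs a display)."""
    from tkinter import messagebox

    process = launch_script(script_name, base_dir)
    messagebox.showinfo("Feature started", f"{feature_label} has started.")
    return process
```

Using `Popen` rather than `subprocess.run` means the launcher window stays responsive while the feature script is running, since `Popen` returns immediately.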
You can extend or customize the application by:
- Adding more buttons for additional features.
- Modifying the color scheme and fonts in the Font and configure sections.
- Implementing error handling for missing scripts or invalid input.
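One way to approach the first and last of these points is to keep the features in a single table and validate the script path before launching. This is a sketch under assumptions: the `FEATURES` table and every function name here are hypothetical, and only the labels and script names come from this README.

```python
from pathlib import Path

# Button label -> script file, using the names listed in this README.
FEATURES = {
    "Enhanced OCR (Gemini AI)": "Gemini_OCR_SDK.py",
    "Simple OCR (EasyOCR)": "simpleocr.py",
    "PDF to Audiobook": "de.py",
    "PDF Translation": "pdftrans.py",
}


def register_feature(label, script_name, features=FEATURES):
    """Add one more button entry; duplicate labels are rejected."""
    if label in features:
        raise ValueError(f"Feature already registered: {label}")
    features[label] = script_name


def resolve_script(label, base_dir=".", features=FEATURES):
    """Return the script path for a label, or raise a descriptive error."""
    if label not in features:
        raise KeyError(f"Unknown feature: {label}")
    path = Path(base_dir) / features[label]
    if not path.is_file():
        raise FileNotFoundError(f"Script for '{label}' not found: {path}")
    return path


def build_buttons(parent, on_click, features=FEATURES):
    """Create one tkinter button per feature (needs a display)."""
    import tkinter as tk

    for label in features:
        # Bind the current label as a default argument so each button
        # passes its own label rather than the last one in the loop.
        button = tk.Button(parent, text=label,
                           command=lambda name=label: on_click(name))
        button.pack(pady=8)
```

With this layout, adding a button is a single `register_feature(...)` call, and an `on_click` handler can catch `FileNotFoundError` from `resolve_script` and show it in a message box instead of failing silently.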
We welcome contributions! To contribute:
- Fork the repository.
- Create a new branch:
  git checkout -b feature-name
- Commit your changes:
  git commit -m "Add feature"
- Push your changes:
  git push origin feature-name
- Open a Pull Request.
- Scripts Not Found: Ensure the external scripts (Gemini_OCR_SDK.py, simpleocr.py, etc.) exist in the same directory as main.py.
- Dependencies Missing: Install missing dependencies using pip install.
- Fullscreen Not Exiting: Press Esc to toggle out of fullscreen mode.
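For reference when debugging the fullscreen behavior, a fullscreen window with an Esc binding can be sketched in tkinter as follows. This assumes the standard `attributes("-fullscreen", ...)` call; the function names are hypothetical, and the project's own script may be organized differently.

```python
def exit_fullscreen(root):
    """Leave fullscreen; harmless when the window is already windowed."""
    root.attributes("-fullscreen", False)


def enable_fullscreen(root):
    """Put the window in fullscreen and let Esc return to windowed mode."""
    root.attributes("-fullscreen", True)
    # Key bindings receive an event object, which is not needed here.
    root.bind("<Escape>", lambda event: exit_fullscreen(root))


def demo():
    """Open an empty fullscreen window (needs a display)."""
    import tkinter as tk

    root = tk.Tk()
    enable_fullscreen(root)
    root.mainloop()
```

Calling `demo()` opens the window; pressing Esc should return it to a normal window.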
This project is licensed under the MIT License. See the LICENSE file for details.
Created by Samir Meshram. For support, suggestions, or bug reports, feel free to reach out or open an issue.
Happy coding! 🎉
