Blog

  • BlahST

    Sample hotkey map for BlahST. Base image source: Wikimedia

    BlahST – Speech Input in Any Editable Text Field

    Blah Speech-to-Text lets you have a bla(h)st inputting text from speech on Linux, with keyboard shortcuts and whisper.cpp. Fire up your microphone and perform high-quality, multilingual speech recognition offline. Extended with local LLMs, it becomes a potent tool for conversing with your Linux computer. No VRAM-consuming GUI, just a few configurable hotkeys.

    BlahST is probably the leanest Whisper-based speech-to-text input tool for Linux, sitting on top of whisper.cpp.

    • Fast speech recognition with wsi via local whisper.cpp, or a whisper.cpp server for even faster network transcription.
    • Can select speech input language and translate to English with the dedicated wsiml script
    • Option to use a downloaded portable whisperfile (with a hotkey set for wsi -w)
    • The blooper utility does continuous “hands-free” speech input or dictation in an automatic pasting loop, using xdotool or ydotool. It exits on longer silence and can be reactivated with a hotkey.
    • Interaction with local LLMs via llama.cpp or a llamafile, producing textual answers or translations that are both spoken back and available in the clipboard. This functionality is in the wsiAI and blahstbot scripts.
    • EXPERIMENTAL: Added an AI proofreader, triggered on any selected (editable) text by speech: “Computer, proofread…” or “Computer, be like Grammarly…”. After a short while, the selected text should be automatically replaced by the LLM output.
    • NEW: Low-latency speech chat with local LLMs (blahstbot). Natural, spoken conversation with an LLM, with optional context loaded from any mouse-selected text. Please see the next video for a demo on an average Linux computer with a 12GB GPU (unmute audio). (Another demo with multilingual speech chat)
    • NEWER & EXPERIMENTAL: Streaming speech-to-speech chat with blahstream, which also autopastes the text in the initially targeted window as it arrives. Context compression via summarization.
    blahstbot.mp4

    The above video demonstrates using blahstbot for spoken interaction with Gemma3_12B, loaded in llama-server on localhost. There is no added delay; the LLM actually answers quickly, making the conversation smooth. Under the hood, the script (triggered with a GNOME hotkey bound to blahstbot -n) passes the text from the recognized speech to llama-server, gets the response back, formats it and sends it to piper for TTS conversion, while also loading it in the clipboard. Note that the LLM fits completely in GPU VRAM, which helps the snappy performance.

    Using low-resource, optimized command-line tools, spoken text input happens very fast. Here is a demonstration video (please UNMUTE the audio) with some local LLM features (AI assistant, translator, scheduler, CLI guide in testing stage):

    BlahST-AI-Demo.mp4

    In the above video, the audio starts with the system announcing the screencasting (my GNOME extension “Voluble” speaks out loud all GNOME desktop notifications), followed by multiple turns of speech input/recognition. Demonstrated at the end is one of the upcoming “AI functions”, which takes the text transcribed by BlahST (whisper.cpp), formats it into an LLM prompt and sends it to a local multilingual LLM (llama.cpp or llamafile), which returns the Chinese translation as text and also speaks it using a neural TTS. Orchestrating this from the command line with lean executables leaves the system surprisingly snappy (from the video you can see that the PC barely breaks a sweat – temperatures remain low-ish).

    blooper-Demo.mp4

    The above video (unmute please) demonstrates the use of blooper, modified from wsi to transcribe in a loop until the user terminates speech input with a longer pause (~3 sec as preset). With the use of xdotool (or ydotool for Wayland users), text is pasted automatically on any pause (or on hotkey interruption). For the video above, the speech is generated with a synthetic voice and collected by the microphone. This allows me to edit the text concurrently (multitaskers, don’t try this at home:). At the end, the top-bar microphone icon should disappear, indicating program exit. It does not happen in the video because the screencast utility has a claim on the icon too.

    Principle of Operation (the best UI is no UI at all.)

    The idea with BlahST is to be the UI-free software equivalent of a tsunami: a short and powerful wave of CPU/GPU action and then it is gone out of the way, with only textual traces in the clipboard and relative desktop peace. Just use a pair of hotkeys to start and stop recording from the microphone and send the recorded speech to whisper.cpp [server], which dumps the transcribed text into the clipboard (unless you pass it through a local LLM before that). A universal approach that should work in most Linux desktop environments and distributions.

    The work is done by one of the scripts:

    • wsi for general speech input,
    • wsiml for multilingual users,
    • wsiAI for users who want to also speak with a local large language model using llama.cpp or a llamafile.
    • blooper is an experimental tool for continuous dictation that will exit if a longer pause is detected.
    • blahstbot is a tool for spoken chat with a local (LAN) LLM, performing low-latency speech-to(-text-to)-speech conversation and making the LLM response available to (auto)paste.
    flowchart LR
        style net fill:#1af,stroke:#333,stroke-width:2px
        E0==>ST 
        E1 ==> ST[🎤Speech-to-Text Input Anywhere]
        E1 ==> D[✨Experimental AI Tools <br> One-shot LLM Interaction]
        E2 ==>|flags -l -t|Ml[🌐 Multilingual Speech to Text.
    Translation into English]
        E3 ==> D1[⌨️ Continuous Dictation <br> Automatic Paste and Stop]
    	   D -->|"Assistant..."| D6([🤖 One-shot AI Assistant])
         D -->|"Computer..."| D2([📝 AI Proofreader, via speech keyword])
    	   D --> |"Translator..."|D4([🌍 AI Translator])
        E4 ==> D3[💬 Low-latency Speech Chat with a Local LLM 🤖]
        E5 ==> D7[💬 Even lower-latency Streaming Speech Chat with a Local LLM 🤖]
        subgraph net [<h3><b>BlahST<br></b></h3>]
    	   direction TB
    	   E0{{wsi}}
         E1{{wsiAI}}
         E2{{wsiml}}
         E3{{blooper}}
         E4{{blahstbot}}
         E5{{blahstream}}
        end
    

    Speech recognition is performed by whisper.cpp, which must be precompiled on your Linux system or available as a server instance on your LAN or localhost. Alternatively, you can choose to simply download and use an actually portable executable (with an embedded whisper model), a whisperfile, now part of the llamafile repository.

    When speech input is initiated with a hotkey, a microphone indicator appears in the top bar (at least in GNOME) and is shown for the duration of the recording (it can be interrupted with another hotkey). The disappearance of the microphone icon from the top bar indicates completion, and the transcribed text can be pasted from the clipboard. On slower systems there may be a slight delay after the microphone icon disappears and before the text reaches the clipboard, due to longer transcription time. On my computer, via the whisper.cpp server API, it is less than 150 ms (300 ms with local whisper.cpp) for an average paragraph of spoken text.

    For keyboard-only operation, with the standard CTRL+V for example, the standard clipboard will be used under X11 and Wayland (wsi or wsiml), while wsi -p (or wsiml -p) uses the PRIMARY selection and text is pasted with the middle mouse button. For left-hand paste, speech recording can be relegated to hotkeys triggered with the right hand. For example, I have set up the unused “+” (START RECORDING) and “Insert” (STOP RECORDING) keys on the numeric keypad.

    DATAFLOW DIAGRAMS

    wsiAI script (sample LLM interaction)

    wsiAI dataflow

    blooper (speech input in a loop)

    blooper dataflow

    USAGE SUMMARY

    • On the press of a hotkey combo, the wsi -p script will record speech (stopped with a hotkey or by silence detection), use a local copy of whisper.cpp and send the transcribed text to the PRIMARY selection under either X11 or Wayland. Then all one has to do is paste it with the middle mouse button anywhere they want. (For people holding the mouse with their right hand, speech recording hotkeys for the left hand would make sense.)

    • If using wsi with no flags (the approaches can coexist, just set up a different set of hotkeys), the transcribed text is sent to the clipboard (not the PRIMARY selection) under either X11 or Wayland. Then pasting happens with the CTRL+V (CTRL+SHIFT+V for GNOME terminal) or SHIFT+INSERT keys as usual. (For most people, right-hand hotkeys would work well.)

    • If transcribing over the network with wsi -n (selected with a hotkey of its own), the script will attempt to send the recorded audio to a running, properly configured whisper.cpp server (on the LAN or localhost). It will then collect the textual response and format it for pasting with the CTRL+V (CTRL+SHIFT+V for GNOME terminal) or SHIFT+INSERT keys (to paste with the middle mouse button use wsi -n -p instead).

    • If using a whisperfile instead of, or in addition to a compiled whisper.cpp, invoke with wsi -w ... and the script will use the preset actually portable executable with the embedded whisper model of choice.

    • For multilingual users, in addition to the features of wsi, wsiml provides the ability to specify a language, e.g. -l fr, and the option to translate to English with -t. The user can in principle assign multiple hotkeys to the various languages they transcribe or translate from. For example, two additional hotkeys can be set, one for transcribing and another for translating from French, by assigning the commands wsiml -l fr and wsiml -l fr -t correspondingly.

    • blooper: Use the supplied script blooper for continuous, automatic speech-to-text input (no need to press CTRL+V or click the middle mouse button). This is demonstrated in the second video above. Please note that the clipboard is used by default and the text will be autopasted under the keyboard caret; in principle the PRIMARY selection can be set up instead, in which case a middle mouse button click will be simulated and the text pasted at the current position of the mouse pointer at the (somewhat arbitrary) time the text is available. Please note that this relies on silence detection, which depends on your physical environment. In noisy environments, use the hotkey to stop recording.

    • blahstbot: When all one wants to do is have a spoken conversation with a local LLM, they can use blahstbot to perform UI-free speech chat with minimal latency. This can be done from a slower computer over the LAN (since the whisper server and llama-server are used, with a hotkey bound to blahstbot -n) and does not need to be a continuous conversation with contiguous text exchange. The user can perform other tasks between questions, use the supplied answers (available in the clipboard) and then come back later to continue within the context or change the subject (also possible via the spoken command “RESET CONTEXT”).

    • blahstream: An arguably usable work in progress, this tool does what blahstbot does, but in streaming fashion: as text is being generated by the LLM, it is also spoken and autopasted. blahstream manages its conversation context size via intermittent summarization and, on X11, if the user moves away from the target window (where text is autopasted), the text is buffered until the user has the target window back in focus and then it is pasted, all while the LLM is streaming.
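
    To make the modes above concrete, here is a sketch of the commands one might bind to separate hotkeys (key choices, paths and which modes you actually use are up to you; the commands themselves are the ones described in the list above):

    # start recording, paste from the clipboard (CTRL+V):
    wsi
    # start recording, paste from the PRIMARY selection (middle mouse button):
    wsi -p
    # network transcription via a whisper.cpp server:
    wsi -n
    # use a downloaded whisperfile instead of compiled whisper.cpp:
    wsi -w
    # French transcription, and French-to-English translation:
    wsiml -l fr
    wsiml -l fr -t
    # speech chat with a local LLM over the network:
    blahstbot -n
    # stop recording (bind to a second hotkey):
    pkill rec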


    SYSTEM SETUP

    PREREQUISITES:

    • zsh command-line shell installation on a Linux system running any modern desktop environment.
    • working whisper.cpp installation or a listening whisper.cpp server on your LAN/localhost (see network-transcription section), or optionally a downloaded whisperfile.
    • The orchestrator tools wsi, wsiAI or wsiml (along with blooper, blahstbot and blahstream) from this repository must be placed in your $HOME/.local/bin/ folder, in your $PATH. The installation script install-wsi handles most of these, but it needs to be made executable and accessible itself.
    • recent versions of ‘sox’, ‘xsel’ (or ‘wl-copy’ on Wayland) command-line tools from your system’s repositories.
    • A working microphone
    • To use the speech chat (blahstbot), a local llama.cpp installation or a listening llama-server (part of llama.cpp) is also needed.

    DISCLAIMER: The author neither takes credit nor assumes any responsibility for any outcome that may or may not result from interacting with the contents of this document. The proposed actions and automations (e.g. installation locations etc.) are merely suggestions and are based on the author’s choice and opinion. As they may not fit the taste or the particular situation of everyone, please, adjust as needed.

    INSTALLATION

    In a folder of your choice, clone the BlahST repository and then choose an installation method from below:

    git clone https://github.com/QuantiusBenignus/BlahST.git
    cd ./BlahST
    chmod +x install-wsi
    
    # If using the installation script:
    ./install-wsi
    
    USING THE INSTALLATION SCRIPT
    • Run the script install-wsi from the folder of the cloned repository and follow the prompts. It will move the scripts and make them executable, create a link to the whisper-cli executable, set up the environment, set a default whisper.cpp model, check for dependencies and request their installation if missing, etc. The script will also help you with the setup of a whisperfile of your choice if you select that option. The installation script also handles setup for network transcription, but the IP and port for the whisper.cpp server must be set manually in blahst.cfg.
    • User configuration for all tools has been consolidated in the single file blahst.cfg. Edit the USER CONFIGURATION BLOCK in that file to set up your environment. Local overrides for some variables may be set in the respective scripts.
    • Run the scripts (e.g. wsi or wsiAI) directly from the command line first to verify proper operation. They will be invoked later with hotkeys for speed and convenience.
    MANUAL INSTALLATION

    (Assuming whisper.cpp is installed and the whisper-cli and/or whisper-server executables are compiled in the cloned whisper.cpp repo. See the Prerequisites section.)

    • Place the scripts wsi and/or wsiAI, wsiml, blooper, blahstbot, blahstream and blahst.cfg in $HOME/.local/bin/
    • Make them executable:
      cp wsi wsiAI wsiml blooper blahstbot blahstream blahst.cfg $HOME/.local/bin/
      cd $HOME/.local/bin; chmod +x wsi wsiAI wsiml blooper blahstbot blahstream
      
    • Make sure $HOME/.local/bin is part of your $PATH ( echo $PATH ) and if not, add it (e.g. by placing export PATH="$HOME/.local/bin:$PATH" in your .profile or .zprofile file)
    • Configure the user environment by editing blahst.cfg and the tool-specific USER_CONFIG_BLOCK in each file. Please, see CONFIGURATION below.
    • Run the tools (e.g. wsi or wsi -n) once from the command line to let the scripts check for required dependencies:
      # If .local/bin is still not in the $PATH:
      ./wsi -n
      # If .local/bin is already in the $PATH:
      wsi -n
      # Can also run with --help to get an idea of the options for the specific tool:
      wsi --help
      wsiAI --help
      
    • If using local whisper.cpp, create a symbolic link (the code expects ‘transcribe’ in your $PATH) to the compiled “whisper-cli” executable in the whisper.cpp directory. For example, create it in your $HOME/.local/bin/ (part of your $PATH) with
    ln -s /full/path/to/whisper.cpp/whisper-cli $HOME/.local/bin/transcribe
    #also if using the whisper-server:
    ln -s /full/path/to/whisper.cpp/whisper-server $HOME/.local/bin/whserver
    

    If transcribe is not in your $PATH, either edit the call to it in wsi to include the absolute path, or add its location to the $PATH variable; otherwise the script will fail. If you prefer not to compile whisper.cpp, or in addition to that, download a suitable whisperfile and set its executable flag, for example:

    cd $HOME/.local/bin
    wget https://huggingface.co/Mozilla/whisperfile/resolve/main/whisper-tiny.en.llamafile
    chmod +x whisper-tiny.en.llamafile
    

    CONFIGURATION

    IMPORTANT: The configuration of BlahST has been migrated into a single file, blahst.cfg, that is now shared by all tools. Near the beginning of this file there is a section named “USER CONFIGURATION BLOCK”, where all the user-configurable variables have been collected, grouped in sections by tool. (Inside each script, there may also be a CONFIG BLOCK for local overrides, where needed.) Most settings can be left as is, but the important ones are the locations of the (whisper, LLM, TTS) model files that you would like to use (or the IP and port number for the whisper.cpp or llama.cpp servers). If using a whisperfile, please set the WHISPERFILE variable to the filename of the previously downloaded executable whisperfile, i.e. WHISPERFILE=whisper-tiny.en.llamafile (must be in the $PATH).
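
    As an illustration only, the edits in the USER CONFIGURATION BLOCK are simple shell-style assignments. Apart from WHISPERFILE, which is named above, the variable names below are placeholders; check blahst.cfg itself for the real names used by your version:

    # whisperfile to use (must be in the $PATH):
    WHISPERFILE=whisper-tiny.en.llamafile
    # hypothetical examples of the other kinds of settings collected there:
    # WHISPER_MODEL="/path/to/whisper.cpp/models/ggml-base.en.bin"   # local whisper model file
    # WHISPER_HOST=192.168.1.10    # whisper.cpp server IP
    # WHISPER_PORT=8080            # whisper.cpp server port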

    GUI SETUP OF HOTKEYS

    To start and stop speech input, for both manual and automatic installation

    CASE 1: GNOME
    Hotkey to start recording of speech
    • Open your GNOME system settings and find “Keyboard”.
    • Under “Keyboard shortcuts”, “View and customize shortcuts”
    • In the new window, scroll down to “Custom Shortcuts” and press it.
    • Add a new shortcut (e.g. press +) and give it a name: “Start Recording Speech”
    • In the “Command” field type /home/yourusername/.local/bin/wsi -p for using the middle mouse button or change it to .../wsi for using the clipboard.
    • (For users of the multi-lingual models, replace wsi above with wsiml and if using a whisperfile, add the -w flag, i.e. /home/yourusername/.local/bin/wsi -w ). Finally, to sample the LLM functions, replace wsi with wsiAI.
    • Then press “Set Shortcut” and select an (unused) key combination. For example a key combo like CTRL+ALT+a or a single unused key like KP+ (keypad +).
    • Click Add and you are done.

    The orchestrator script has a silence detection filter in the call to sox (rec) and will stop recording (in the best case) after 2 seconds of silence. In addition, if one does not want to wait or has issues with the silence detection threshold:

    Manual speech recording interruption (strongly recommended)

    For those who want to be able to interrupt the recording manually with a key combination, in the spirit of great hacks, we are going to use the system’s built-in features:

    • Open your GNOME system settings and again, find “Keyboard”.
    • Under “Keyboard shortcuts”, “View and customize shortcuts”
    • In the new window, scroll down to “Custom Shortcuts” and press it.
    • Press “+” to add a new shortcut and give it a name: “Interrupt Speech Input!”
    • In the “Command” field type pkill rec
    • Then press “Set Shortcut” and select an (unused) key combination. For example a key combo like CTRL+ALT+x or a single unused key like KP- (keypad -).
    • Click Add and you are done.

    That simple. Just make sure that the new key binding has not already been set up for something else. Now when the script is recording speech, it can be stopped with the new key combo and transcription will start immediately.

    CASE 2: XFCE4 This is similar to the GNOME setup above (for reference, see its more detailed instructions)
    • Open the Xfce4 Settings Manager.
    • Navigate to Keyboard → Application Shortcuts.
    • Click on the Add button to create a new shortcut.
    • Enter the name of the shortcut and the command e.g. /home/yourusername/.local/bin/wsi -p or .../wsi for using the clipboard.
    • (For users of the multi-lingual models, replace wsi above with wsiml and if using a whisperfile, add the -w flag, i.e. /home/yourusername/.local/bin/wsi -w ). Finally, to sample the LLM functions, replace wsi with wsiAI.
    • Press the keys you wish to assign to the shortcut.
    • Click OK to save the shortcut. The hotkey to stop speech recording should be done similarly with another key combo and the command pkill rec.
    CASE 3: KDE (Plasma) This is similar to the GNOME setup above (for reference, see its more detailed instructions)
    • Open the System Settings application.
    • Navigate to Shortcuts and then Custom Shortcuts.
    • Click on Edit and then New to create a new group for your shortcuts if needed.
    • Under the newly created group, click on New again and select Global Shortcut -> Command/URL.
    • Give your new shortcut a name.
    • Choose the desired shortcut key combination by clicking on the button next to “None” and pressing the keys you want to assign to the shortcut.
    • In the Trigger tab, specify the command to be executed when the shortcut is triggered. e.g. /home/yourusername/.local/bin/wsi or .../wsi -p
    • (For users of the multi-lingual models, replace wsi above with wsiml and if using a whisperfile, add the -w flag, i.e. /home/yourusername/.local/bin/wsi -w ). Finally, to sample the LLM functions, replace wsi with wsiAI.
    • Ensure that the Enabled checkbox is checked to activate the shortcut.
    • Apply the changes by clicking Apply or OK. The hotkey to stop speech recording should be done similarly with another key combo and the command pkill rec.

    Please note that there may be slight variations in the above steps depending on the version installed on your system. For many other environments, such as MATE, Cinnamon, LXQt, Deepin, etc., the steps should be somewhat similar to the examples above. Please consult the documentation for your system’s desktop environment.


    TIPS AND TRICKS
    Sox silence detection

    Sox records in wav format at a 16k rate, the only rate currently accepted by whisper.cpp. This is done in wsi with this command: rec -t wav $ramf rate 16k silence 1 0.1 3% 1 2.0 6%. With this filter, sox will attempt to stop after 2 s of silence with a signal level threshold of 6%. A very noisy environment will prevent the detection of silence and the recording (of noise) will continue. A remedy that may not work in all cases is to adjust the duration and silence threshold in the sox filter in the wsi script. Of course, one can use the manual interruption method if preferred. We can’t raise the threshold arbitrarily because, if one consistently lowers their voice (fadeout) at the end of speech, it may get cut off if the threshold is high. Lower it in that case to a few %.
    It is best to try to make the speech distinguishable from noise by amplitude (speak clearly, close to the microphone), while minimizing external noise (sheltered location of the microphone, noise canceling hardware etc.). With a good speech signal level, the threshold can then be more effective, since the SNR (speech-to-noise ratio:-) is effectively increased. After the speech is captured, it will be passed to transcribe (whisper.cpp) for speech recognition. This will happen faster than real time (especially with a fast CPU or if your whisper.cpp installation uses CUDA). One can adjust the number of processing threads used by adding -t n to the command line parameters of transcribe (please see the whisper.cpp documentation). The script will then parse the text to remove non-speech artifacts, format it and send it to the PRIMARY selection (clipboard) using either X11 or Wayland tools.
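
    For example, to make silence detection more tolerant of a trailing-off voice, the stop threshold in the sox call inside wsi can be lowered, and whisper.cpp can be told how many threads to use (the exact values below are illustrative, not recommendations):

    # stock filter: stop after 2 s below a 6% signal level
    rec -t wav $ramf rate 16k silence 1 0.1 3% 1 2.0 6%
    # more forgiving variant: stop after 2 s below 4%
    rec -t wav $ramf rate 16k silence 1 0.1 3% 1 2.0 4%
    # let whisper.cpp use 4 processing threads (see the whisper.cpp docs for -t),
    # added to the transcribe call inside the script:
    transcribe -t 4 ...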

    Multilingual Support

    In principle, whisper (whisper.cpp) is multilingual and with the correct model file, this application will output UTF-8 text transcribed in the correct language. The wsiml script is dedicated to multi-lingual use and with it the user is able to choose the language for speech input (using the -l LC flag where LC is the language code) and can also translate the speech in the chosen input language to English with the -t flag. The user can assign multiple hotkeys to the various languages that they want to transcribe or translate from. For example, two additional hotkeys can be set, one for transcribing and another for translating from French by assigning the commands wsiml -l fr and wsiml -l fr -t correspondingly.

    Please note that when using the server mode, you now have two choices. You can have either the precompiled whisper.cpp server or the downloaded whisperfile (in server mode) listen at the preconfigured host and port number. The orchestrator script approaches them the same way.
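
    For reference, a whisper.cpp server instance (symlinked as whserver earlier) can be started on the configured host and port roughly as sketched below; the flags follow the whisper.cpp server documentation and the model path is just an example:

    # serve a whisper model on localhost:8080 (example values)
    whserver -m /path/to/whisper.cpp/models/ggml-base.en.bin --host 127.0.0.1 --port 8080 &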

    Protecting the speech input hotkey from multiple invocations

    Sometimes one can mistakenly press the hotkey dedicated to starting wsi (or wsiAI, blooper, blahstbot) instead of the hotkey to stop recording, while the script is already running. This may create a resource utilization mess and is obviously not desired. A way to prevent it is to expand the command assigned to the hotkey from blahstbot -n (the chatbot command) to:

    pidof -q blahstbot wsiAI || blahstbot -n
    

    but this may not work unless it is wrapped in a new shell instance. That is why we implement this protection inside the corresponding script itself. The chatbot utility is used as an example because this regime of operation (interactive speech-to-speech chat) is the most likely to suffer from a mistaken hotkey press, due to the increased frequency of use of these hotkeys during a chat.
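
    If you still prefer to guard at the hotkey level, the check can be wrapped in an explicit shell invocation, for example:

    # hotkey command wrapped in its own shell instance
    bash -c 'pidof -q blahstbot wsiAI || blahstbot -n'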

    Hotkey command to end speech input

    The command initially proposed in the configuration for stopping speech recording was (two equivalent forms):

    pkill --signal 2 rec
    pkill -SIGINT rec
    

    but in some cases, under specific load conditions, this signal may not be reliably transmitted to rec (sox). Then try to use SIGTERM instead, which is a bit more aggressive but will still let rec clear its state gracefully:

    pkill rec
    pkill -SIGTERM rec
    
    Stopping speech output of the chatbot

    The LLM system prompt for the speech-to-speech blahstbot conversation mode instructs the LLM not to be too verbose. But when one finds it talking for too long, the speech can be stopped via the command:

    pkill -SIGINT aplay
    

    to which one can, of course, assign a hotkey for easy access. Alternatively, this can be done with a mouse click, using one of the functions of the Voluble GNOME shell extension.

    Shutting down llama-server and whisper-server

    These are memory (and VRAM) hogs, of course, and sometimes they can be in the way (if you are running them on the local machine). A quick way to shut them down and free resources when not needed would be (note that the prompt cache will be lost):

    pkill llamserver && pkill whserver && echo "OK, the servers have been shut down. VRAM reclaimed!"
    

    In blahstbot and blahstream a “SHUTDOWN” menu item has been added to the zenity role (prompt) selection menu. This is another way to shut the servers down.

    Temporary directory and files

    Speech-to-text transcription is a memory- and CPU-intensive task, and fast storage for read and write access can only help. That is why wsi stores temporary and resource files in memory, for speed and to reduce SSD/HDD “grinding”: TEMPD='/dev/shm'. This mount point of type “tmpfs” is created in RAM (let’s assume that you have enough, say, at least 8GB) and is made available by the kernel for user-space applications. When the computer is shut down it is automatically wiped, which is fine since we do not need the intermediate files. In fact, for some types of applications (looking at you, Electron), it would be beneficial (IMHO) to have the systemwide /tmp mount point also kept in RAM. Moving /tmp to RAM may speed up application startup a bit, a welcome speedup for any Electron app. In its simplest form, this transition is easy, just run:

    echo "tmpfs /tmp tmpfs rw,nosuid,nodev" | sudo tee -a /etc/fstab and then restart your Linux computer. For the aforementioned reasons, especially if an HDD is the main storage medium, one can also move the ASR model files needed by whisper.cpp to the same location (/dev/shm). These are large files that can be transferred to this location at the start of a terminal session (or at system startup). This can be done using your .profile file by placing something like this in it:

    ([ -f /dev/shm/ggml-base.en.bin ] || cp /path/to/your/local/whisper.cpp/models/ggml* /dev/shm/)
    
    
    cliblurt.mp4

    Contributing

    • Could be as simple as starting a new discussion with an idea to expand the use case scenarios for BlahST (for example, use BlahST to ask Gemini questions).
    • Or send a PR for a new feature that substantially enhances BlahST

    Credits

    • Open AI (for Whisper)
    • Georgi Gerganov and community (for Whisper’s C/C++ port whisper.cpp and for the great llama.cpp)
    • Justine Tunney, CJ Pais and the llamafile community (for llamafile and whisperfile)
    • The sox developers (for the venerable “Swiss Army knife of sound processing tools”)
    • The creators and maintainers of CLI utilities such as xsel, wl-copy, curl, jq, xdotool and others that make the Linux environment (CLI and GUI) such a powerful paradigm.
    Visit original content creator repository https://github.com/QuantiusBenignus/BlahST
  • TensorRT-v8-YOLOv5-v5.0

    Accelerated deployment of YOLOv5-v5.0 with TensorRT v8.2

    Project overview

    • Uses the native TensorRT API to build the YOLO network, converting the PyTorch model into a serialized .plan file to accelerate model inference;
    • Based on TensorRT 8.2.4; see the environment setup section below for the exact environment;
    • Mainly references the tensorrtx project, but the author has made extensive changes according to his own coding habits;
    • A version of the project without CUDA-accelerated image preprocessing: no_cuda_preproc

    Project highlights

    • The following compares this project with YOLOv5-v5.0 in the tensorrtx project; this is not a claim that either is better, only that some choices better match the author’s personal habits.
    1. Batch mode: implicit batch (tensorrtx) vs. explicit batch (this project); this is the biggest difference, and many of the differences in the code stem from it
    2. Detect plugin: inherits from IPluginV2IOExt (tensorrtx) vs. IPluginV2DynamicExt (this project)
    3. Detect plugin build: compiled as a shared library (tensorrtx) vs. compiled directly into the final executable (this project)
    4. Inference call: asynchronous (context.enqueue, tensorrtx) vs. synchronous (context.executeV2, this project); the author measured no difference in speed, and the synchronous form is simpler
    5. INT8 quantization: images converted to tensors with OpenCV’s dnn module (tensorrtx) vs. a custom conversion method (this project)
    6. Preprocessing: C++ with OpenCV (tensorrtx) vs. CUDA-accelerated preprocessing (this project); versions after v5.0 also have this, as two different implementations

    Besides the above, there are many other coding differences, not listed one by one.

    Inference speed

    • GPU: GeForce RTX 2080 Ti
    FP32 FP16 INT8
    6 ms 3 ms 3 ms

    Note: the inference time of this project includes preprocessing, the forward pass and postprocessing; the tensorrtx project measured only the forward pass.

    Environment setup

    Host machine base environment

    • Ubuntu 16.04
    • GPU: GeForce RTX 2080 Ti
    • docker, nvidia-docker

    Pull the base image

    docker pull nvcr.io/nvidia/tensorrt:22.04-py3
    • The versions in this image are as follows:
    CUDA cuDNN TensorRT python
    11.6.2 8.4.0.27 8.2.4.2 3.8.10

    Install other libraries

    1. Create the Docker container

      docker run -it --gpus device=0 --shm-size 32G -v /home:/workspace nvcr.io/nvidia/tensorrt:22.04-py3 bash

      Here -v /home:/workspace mounts the host’s /home directory into the container for convenient file exchange; another directory can be chosen instead

      • Switch the container’s package sources to Chinese mirrors
      cd /etc/apt
      rm sources.list
      vim sources.list
      • Copy the following content into the sources.list file
      deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
      deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
      deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
      deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
      deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
      deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
      deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
      deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
      deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
      deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
      • Update the package index
      apt update
    2. Install OpenCV-4.5.0

      • The OpenCV-4.5.0 source link is below; download the zip package, extract it, and place it in the host’s /home directory, i.e. the container’s /workspace directory
      https://github.com/opencv/opencv
      • The following operations are all performed inside the container
      # install dependencies
      apt install build-essential
      apt install libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
      apt install libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libdc1394-22-dev
      # start installing OpenCV
      cd /workspace/opencv-4.5.0
      mkdir build
      cd build
      cmake -D CMAKE_INSTALL_PREFIX=/usr/local -D CMAKE_BUILD_TYPE=Release -D OPENCV_GENERATE_PKGCONFIG=ON -D OPENCV_ENABLE_NONFREE=True ..
      make -j6
      make install

    Run the project

    1. Get the .wts file
    • Main process: copy this project’s pth2wts.py file into the official yolov5-v5.0 directory, run python pth2wts.py there, and obtain the para.wts file
    • The detailed process can follow the steps below
    git clone -b v5.0 https://github.com/ultralytics/yolov5.git
    git clone https://github.com/emptysoal/yolov5-v5.0_tensorrt-v8.2.git
    # download https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5s.pt
    cp {tensorrt}/pth2wts.py {ultralytics}/yolov5
    cd {ultralytics}/yolov5
    python pth2wts.py
    # a file 'para.wts' will be generated.
    2. Build the serialized .plan file and run inference
    • Main process: copy the para.wts file generated in the previous step into this project’s directory, then run make and ./trt_infer in this project
    • The detailed process can follow the steps below
    cp {ultralytics}/yolov5/para.wts {tensorrt}/
    cd {tensorrt}/
    mkdir images  # and put some images in it
    # update CLASS_NUM in yololayer.h if your model is trained on custom dataset
    # you can also update INPUT_H, INPUT_W in yololayer.h, update NET(s/m/l/x) in trt_infer.cpp
    make
    ./trt_infer
    # result images will be generated in present dir
    Visit original content creator repository https://github.com/emptysoal/TensorRT-v8-YOLOv5-v5.0
  • ecoleta-nlw-01

    NextLevelWeek

    Ecoleta

    🚀 Next Level Week 1.0 – Ecoleta

    GitHub language count Repository size GitHub last commit Repository issues License

    Project   |    Layout   |    Technologies   |    Running   |    License


    💻 Project

    This project was developed during the Next Level Week promoted by Rocketseat. It is a full-stack project for Ecoleta, a fictitious (organic and inorganic) waste collection company. The project consists of a frontend (React), mobile (React Native) and backend (Node.js) part, all written in TypeScript.

    🎨 Layout

    You can use the following URL to view all the screens: View

    🚀 Technologies

    This project was developed with the following technologies:

    🚀 Running

    There are two ways to run this project: the Live Version, which is the version running in the cloud for you to check out, and the Localhost version, where you download the complete project and run it from your terminal.

    Localhost:

    It is also possible to run this application locally on a computer or notebook; to do so, a few programs need to be installed:

    Node.js and NPM

    You need to install Node.js and npm (or yarn) to run this application locally. To check whether they are already installed, run the following commands in your terminal:

    node -v
    npm -v

    If either of the two commands returns an error, you need to install Node.js and NPM.

    Guides for downloading and installing Node.js and NPM:

    To download: go to the official Node.js website and follow the step-by-step instructions.

    With Node.js and NPM correctly installed, open your terminal in the project folder and run the following command:

    Start the server:

    npm start

    Wait a few moments.

    If everything goes well, the expected result is a server running on port 3333, which you can access at http://localhost:3333/.

    📝 License

    This project is under the MIT license. See the LICENSE file for more details.


    Made with hours in front of the 💻 by Rodrigo Engelberg

    Visit original content creator repository https://github.com/rodrigoengelberg/ecoleta-nlw-01
  • slice-base-sargs2multislice

    About stdlib…

    We believe in a future in which the web is a preferred environment for numerical computation. To help realize this future, we’ve built stdlib. stdlib is a standard library, with an emphasis on numerical and scientific computation, written in JavaScript (and C) for execution in browsers and in Node.js.

    The library is fully decomposable, being architected in such a way that you can swap out and mix and match APIs and functionality to cater to your exact preferences and use cases.

    When you use stdlib, you can be absolutely certain that you are using the most thorough, rigorous, well-written, studied, documented, tested, measured, and high-quality code out there.

    To join us in bringing numerical computing to the web, get started by checking us out on GitHub, and please consider financially supporting stdlib. We greatly appreciate your continued support!

    sargs2multislice

    NPM version Build Status Coverage Status

    Create a MultiSlice object from a comma-separated list of string-serialized MultiSlice constructor arguments.

    Installation

    npm install @stdlib/slice-base-sargs2multislice

    Alternatively,

    • To load the package in a website via a script tag without installation and bundlers, use the ES Module available on the esm branch (see README).
    • If you are using Deno, visit the deno branch (see README for usage instructions).
    • For use in Observable, or in browser/node environments, use the Universal Module Definition (UMD) build available on the umd branch (see README).

    The branches.md file summarizes the available branches and displays a diagram illustrating their relationships.

    To view installation and usage instructions specific to each branch build, be sure to explicitly navigate to the respective README files on each branch, as linked to above.

    Usage

    var sargs2multislice = require( '@stdlib/slice-base-sargs2multislice' );

    sargs2multislice( str )

    Creates a MultiSlice object from a comma-separated list of string-serialized MultiSlice constructor arguments.

    var s = sargs2multislice( '0,Slice(2,10,1),1' );
    // returns <MultiSlice>
    
    var d = s.data;
    // returns [ 0, <Slice>, 1 ]

    The function returns null if provided an invalid string.

    var s = sargs2multislice( 'foo,bar' );
    // returns null

    Notes

    • This function is useful when wanting to create a MultiSlice object from an array of constructor arguments which has been serialized to a string (e.g., when working with Proxy objects supporting slicing).

      var Slice = require( '@stdlib/slice-ctor' );
      
      var args = [ 0, new Slice( 2, 10, 1 ), 1 ];
      
      // ...
      
      var sargs = args.toString();
      // returns '0,Slice(2,10,1),1'
      
      // ...
      
      var s = sargs2multislice( sargs );
      // returns <MultiSlice>
      
      var d = s.data;
      // returns [ 0, <Slice>, 1 ]

    Examples

    var sargs2multislice = require( '@stdlib/slice-base-sargs2multislice' );
    
    var s = sargs2multislice( 'null,null,null' );
    var d = s.data;
    // returns [ null, null, null ]
    
    s = sargs2multislice( '10,Slice(2,10,1),null' );
    d = s.data;
    // returns [ 10, <Slice>, null ]
    
    s = sargs2multislice( '2,Slice(2,10,1),-5' );
    d = s.data;
    // returns [ 2, <Slice>, -5 ]
    
    s = sargs2multislice( 'foo,bar' );
    // returns null

    See Also


    Notice

    This package is part of stdlib, a standard library for JavaScript and Node.js, with an emphasis on numerical and scientific computing. The library provides a collection of robust, high performance libraries for mathematics, statistics, streams, utilities, and more.

    For more information on the project, filing bug reports and feature requests, and guidance on how to develop stdlib, see the main project repository.

    Community

    Chat


    License

    See LICENSE.

    Copyright

    Copyright © 2016-2025. The Stdlib Authors.

    Visit original content creator repository https://github.com/stdlib-js/slice-base-sargs2multislice
  • CheckpointBot

    Checkpoint Bot

    The bot for Checkpoint 1

    This is a bot to help with tournament and verification season management for Checkpoint 1. It is partially derived from my rewrite of Radia; however, there are still many custom-built cogs.

    usage

    This is for staff members; non-staff members can type !help.

    season cog

    Players are verified based on their ranks, and will receive a rank role. Seasons are used to avoid outdated ranks; at the start of each new season, all of the rank roles are replaced.

    Additionally, a Verified role is used to handle channel permissions and channel access. This role is not replaced each season. However, it may be pruned every couple of seasons.

    Here is how the verification process works.

    1. A user makes a role request in the respective channel, showing proof of their rank and asking for the correct role.
    2. A Barista verifies the user and gives them the Verified role and their respective rank role.

    The !season command group is used to manage season rank roles.

    • !season List the season roles. The command will also tell you the index of each tournament, this is important.
      • Alternatively, as an alias, you can use !season roles for the same result.
    • !season delete Removes the old season roles.
      • If you are afraid it will delete the wrong roles, you can double-check what roles it will be deleting by doing !season or !season list.
    • !season new <name> [delete=False] Creates new season roles, you are required to specify the name of the season.
      • <name>: Any name works, but if you specify a literal season (such as ‘winter’), it will automatically convert it into an emoji.
      • [delete=False]: Additionally, you can specify to automatically call delete old season roles before creating new roles (!season new spring true).
    • !season prune Removes the Verified role from anyone without a season rank role.
      • This is helpful every couple of seasons when you want to remove the verified role from those who have neglected to request a role for a while.

    tourney cog

    • !whatis is used to quickly look up glossary terms and tournament rules, such as “dc”, “swiss”, and “glossary” (yes, the glossary includes a glossary). Requires the google.json file (see Google Setup).

    These commands are used like !whatis battlefy, if you omit the argument, the bot will automatically list all the possible options.

    local setup

    1. Make sure you have Docker installed.
    2. A Google API project for the bot.

    Google setup

    1. Enable the following API
    2. Go to API & Services and navigate to the Credentials tab
    3. Click on + create credentials and create a new Service Account, filling in the necessary fields.
      • When you get to Role, give it Editor.
    4. Download the credentials file and rename it google.json
    5. Share the Google Sheet with the client_email from the json file.
    6. Copy the gsheet key from the URL at https://docs.google.com/spreadsheets/d/{key}/edit; you will use this in the .env

    bot setup

    1. Create a .env in the repository root:

      TOKEN = discord.bot.token
      GSHEET = gsheet_key
      SENTRY = "System Environment"  # Optional
      DEBUG = 1  # Optional

      Please know that there are no true or false values in .env files. If you want to set a key to false, set it to 0

    2. Run docker-compose up in the repository root.


    empathy included • @cysabi • cysabi.github.io

    Visit original content creator repository https://github.com/cysabi/CheckpointBot

  • StuConect

    StuConect

    Description:

    The project aims to provide constant communication between a student group and an authority group. Functionality includes information sharing, real-time location sharing and message sharing. Google, Facebook and Firebase APIs were used to support the project.

    Technical details:

    ----backend----
    The Firebase API acts as the backend server for this project.

    ----UI----
    Bottom navigation is used to switch between the news feed, friend list, friend map and chat activities (Fragments). Google Maps is used for locating the user. ListViews and custom adapters were used for the news feed and chat activities.

    ----Login----
    For login, the Google and Facebook login APIs and the email method from Firebase were integrated into the project. To identify each user, a token ID is generated and stored in the Firebase database.

    ----Authentication----
    Firebase Authentication is used to manage user accounts and sign-in.

    ----news feed----
    For the information sharing module, a list view with a custom adapter was used for a better implementation. It displays a user's post along with the timestamp and the name of the user. A floating action button acts as the starting point for the user to initiate the posting process.

    ----Maps----
    The Google Maps API is used to point out the location of a particular user. Markers show info about the user, i.e. name and timestamp.
    A Service class gets the location details of the user in the background at an interval of 60 seconds.

    ----chat----
    A ListView displays the messages; FirebaseListAdapter gets the work done with ease. Messages are stored in Firebase's Realtime Database.

    ----profile----
    Shows the registered user's account type, email and name.

    Visit original content creator repository https://github.com/Roopan14/StuConect

  • main-thread-scheduling


    main-thread-scheduling

    Fast and consistently responsive apps using a single function call

    Gzipped Size Build Status


    Install

    npm install main-thread-scheduling

    Overview

    The library lets you run computationally heavy tasks on the main thread while ensuring:

    • Your app’s UI doesn’t freeze.
    • Your users’ computer fans don’t spin.
    • Your INP (Interaction to Next Paint) is in green.
    • It’s easy to plug it into your existing codebase.

    A real-world showcase: searching in a folder with 10k notes, 200k+ lines of text taking 50MB on disk, and getting results instantly.

    Use Cases

    • You want to turn a synchronous function into a non-blocking asynchronous function. Avoids UI freezes.
    • You want to render important elements first and less urgent ones second. Improves perceived performance.
    • You want to run a long background task that doesn’t spin the fans after a while. Avoids bad reputation.
    • You want to run multiple backgrounds tasks that don’t degrade your app performance with time. Prevents death by a thousand cuts.

    How It Works

    • Uses requestIdleCallback() and requestAfterFrame() for scheduling.
    • Stops task execution when user interacts with the UI (if navigator.scheduling.isInputPending() API is available).
    • Global queue. Multiple tasks are executed one by one so increasing the number of tasks doesn’t degrade performance linearly.
    • Sorts tasks by importance. Sorts by strategy and gives priority to tasks requested later.
    • Considerate about your existing code. Tasks with idle strategy are executed last so there isn’t some unexpected work that slows down the main thread after the background task is finished.

    Why

    • Simple. 90% of the time you only need the yieldOrContinue(strategy) function. The API has two more functions for more advanced cases.
    • Production ready. Actively maintained for three years — see contributors page. I’ve been using it in my own products for over four years — Nota and iBar. Flux.ai are also using it in their product (software for designing hardware circuits using web technologies).
    • This is the future. Some browsers have already implemented support for scheduling tasks on the main thread. This library tries even harder to improve user perceived performance — see explanation for details.
    • High quality. Aiming for high-quality with my open-source principles.

    Example

    You can see the library in action in this CodeSandbox. Try removing the call to yieldOrContinue() and then type in the input to see the difference.

    API

    yieldOrContinue(strategy: 'interactive' | 'smooth' | 'idle', signal?: AbortSignal)

    The complexity of the entire library is hidden behind this method. You can have great app performance by calling a single method.

    async function findInFiles(query: string) {  
        for (const file of files) {
            await yieldOrContinue('interactive')
            
            for (const line of file.lines) {
                fuzzySearchLine(line, query)
            }
        }
    }

    scheduleTask(callback: () => T, { strategy, signal }): T

    This mimics the API style of scheduler.postTask() while providing the extra benefits of main-thread-scheduling.

    const controller = new AbortController()
    const result = await scheduleTask(() => {
        return computeHeavyCalculation()
    }, {
        strategy: 'smooth',
        signal: controller.signal,
    })

    More complex scenarios

    The library has two more functions available:

    • yieldControl(strategy: 'interactive' | 'smooth' | 'idle', signal?: AbortSignal)
    • isTimeToYield(strategy: 'interactive' | 'smooth' | 'idle', signal?: AbortSignal)

    These two functions are used together to handle more advanced use cases.

    A simple use case where you will need those two functions is when you want to render your view before yielding back control to the browser to continue its work:

    async function doHeavyWork() {
        for (const value of values) {
            if (isTimeToYield('interactive')) {
                render()
                await yieldControl('interactive')
            }
            
            computeHeavyWorkOnValue(value)
        }
    }

    Scheduling strategies

    There are three scheduling strategies available. You can think about them more easily by completing the sentence with one of the three words: “Scheduling the task keeps the page interactive/smooth/idle.”

    • interactive – use this for things that need to display to the user as fast as possible. Every interactive task is run for 83ms – this gives you a nice cycle of doing heavy work and letting the browser render pending changes.
    • smooth — use this for things you want to display to the user quickly but you still want for animations to run smoothly for example. smooth runs for 13ms and then gives around 3ms to render the frame.
    • idle – use this for background tasks. Every idle task is run for 5ms.

    Alternatives

    Web Workers

    Web Workers are a great fit if you have: 1) a heavy algorithm (e.g. image processing), 2) a heavy process (runs for a long time, big part of the app lifecycle). However, in reality, it’s rare to see people using them. That’s because they require a significant investment of time due to the complexity that can’t be avoided when working with CPU threads, regardless of the programming language. This library can be used as a gateway before transitioning to Web Workers. In most cases, you would discover that doing it on the main thread is good enough.

    • The calculation requires a lot state/data. Transferring that data takes too much time and the tradeoff isn’t worth it.
    • A lot of tiny calculations in between code that can’t run in a web worker. Running the tiny calculation alone doesn’t provide a benefit. You still want to not block the UI.
    • Already existing features that need restructuring to accommodate a web worker implementation might not be worth it for now. However, a quick toss-in of yieldOrContinue() in an async function might be a quick gain until the product is ready to adopt a more complicated solution.
    • Web Workers are harder to maintain (why?).

    scheduler.postTask()

    scheduler.postTask() is available in some browsers today. postTask() and main-thread-scheduling do similar things. You can think of postTask() as a lower level API — it might be the right choice in specific scenarios. Library owners might be interested in exploring the nuanced differences between the two. For most cases, main-thread-scheduling provides a scheduleTask() method that mimics that API of postTask() while providing the extra benefits of the library.

    Need help?

    Need help with performance or consulting on how to integrate main-thread-scheduling in your project? Write to me at hello@astoilkov.com.

    Visit original content creator repository https://github.com/astoilkov/main-thread-scheduling
  • SBUS2-Telemetry

    SBUS2-Telemetry Library

    Arduino Library for receiving SBUS and SBUS2 Frames and transmit Telemetry Data

    • with Atmega328P MCU
      • The Library uses the U(S)ART Interrupt and Timer2
      • You can’t use Serial in your sketch! Please use SoftwareSerial instead.
    • with ESP32
      • The Library uses Serial1 (UART_NUM_1) and TIMER_1 in TIMER_GROUP_0
      • RX1 = GPIO_NUM_25 and TX1 = GPIO_NUM_26
      • Serial, Timer and RX/TX Pins can be changed

    Setting up the Example Sketch

    • Before you start flashing and coding, please Setup you Futaba Radio Control
      • Select Modulation-> FASSTest -> 14CH and bind your Receiver
      • You should see the SerialNumber of the Receiver
      • In Telemetry Screen, you should see the RX Voltage
      • Go to the “Sensors” Menu and activate the Inactive Slots
      • See #defines in the example Sketch -> #define TEMPRATURE_SLOT 1
      • Set Slot1 (Inactive) to Temp125 Sensor
      • Do the same with all other Sensors defined in the Example Sketch
    • Build up you Arduino Hardware with Atmega328P
      • Build your Inverter Circuit
      • Flash the Sketch -> On the Arduino Pro Mini you must flash without the Inverter attached!
      • Attach the Inverter to RX and TX on the Arduino Board
      • Attach the Inverter to SBUS2 Port on your Receiver
    • Build up you Arduino Hardware with ESP32
      • Place 1k between your RX1 and TX1 Pins
      • Attach the SBUS2 Port to RX1 Pin
      • Flash the Sketch
    • Power up
      • You can power your Futaba Receiver from Arduino (5V)
      • Or you power your Arduino from Futaba Receiver (BEC)
      • You will see Telemetry Data on your Telemetry Screen

    Setting up a custom Sketch

    You can’t set every Sensor to every Slot! There are Sensors which use 1 Slot and other Sensors use 3 or 8 Slots.

    After receiving the SBUS Frame it’s possible to transmit 8 Telemetry Slots.

    There are 32 different Telemetry Slots available.

    Example

    • Slot 0 is always RX Voltage -> so you can’t use Slot 0 for custom Sensors
    • So you have 7 Slots left (Slot 1 to Slot 7)
    • So you cant use a 8 Slot Sensor (GPS) on Slot 1!
    • GPS has to be on Slot 8, or Slot 16, or Slot 24
    • On Slot 1 you could set a Temperature Sensor (1 Slot Sensor)
    • On Slot 2 you could set a RPM Sensor (1 Slot Sensor)
    • On Slot 3 you could set a Power Sensor (3 Slot Sensor)
    • The Power Sensor uses Slot 3, Slot 4 and Slot 5 (3 Slot’s!)
    • The next free Slot would be Slot 6
    • Slot 6 can’t be a Power Sensor, because you have just 2 Slots free!
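
    As a concrete illustration of that layout, the defines below mirror the example Sketch’s #define convention; the names are placeholders, not identifiers taken from the Library:

    // Illustrative slot map for the example above (names are placeholders):
    #define TEMPERATURE_SLOT 1   // 1-Slot Sensor (Temp125)
    #define RPM_SLOT         2   // 1-Slot Sensor
    #define POWER_SLOT       3   // 3-Slot Sensor -> also occupies Slot 4 and Slot 5
    #define GPS_SLOT         8   // 8-Slot Sensor -> must start at Slot 8, 16 or 24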

    The easiest Way for a Working Setup

    • First set all Sensors in your Futaba Radio Control
    • Your Futaba system will keep the Slot layout consistent for you.
    • After you have set up all your favorite Sensors in your Radio, you can #define the used Slots in the Sketch
    • Take a look at which Radio supports which Sensor: PDF

    Structure

    You can use every Sensor as often as you want, but you have a maximum of 31 Sensor Slots. If you want multiple Temp Sensors, just change the Slot Number:

    • send_temp125(TEMPRATURE_SLOT1, (int16_t)50);
    • send_temp125(TEMPRATURE_SLOT2, (int16_t)80);
    • send_temp125(TEMPRATURE_SLOT3, (int16_t)20);
    • send_temp125(TEMPRATURE_SLOT4, (int16_t)999);
    • The same applies to all other Sensors. The Sensor values are updated with your loop() cycle time (see the minimal sketch after this list).
    • If your loop() contains a delay(2000), your Telemetry Data will be updated every 2 seconds
    • If your loop() has no delays, your Telemetry Data will be updated every 60ms (every 4 SBUS Frames)
    • The Servo Channel Data is updated every 15ms (with every new SBUS(2) Frame)
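
    A minimal sketch of that update pattern; the header name and the SBUS2_Setup() initialization call are assumptions (neither is shown in this README), so copy the initialization from the bundled example Sketch if it differs:

    #include <SBUS2.h>   // assumed header name

    #define TEMPRATURE_SLOT1 1
    #define TEMPRATURE_SLOT2 2

    void setup() {
      SBUS2_Setup();     // assumed initialization call; see the example Sketch
    }

    void loop() {
      if (SBUS2_Ready()) {                            // SBUS2 Frames are being received
        send_temp125(TEMPRATURE_SLOT1, (int16_t)50);  // Temp125 Sensor on Slot 1
        send_temp125(TEMPRATURE_SLOT2, (int16_t)80);  // Temp125 Sensor on Slot 2
      }
      // No delay() here, so the Telemetry values are refreshed every ~60ms
      // (every 4 SBUS Frames), as described above.
    }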

    Stand alone SBUS Library

    This Library can also be used just for getting Servo Channel Data. Telemetry Data is only sent to SBUS2-compatible Receivers.

    • SBUS_Ready() -> True when new Servo data is received with SBUS and SBUS2 Frames
    • SBUS2_Ready() -> True when SBUS2 Frames are received
    • SBUS2_get_status() will also work with just SBUS Frames

    Supported Sensors

    • Temp125 -> Temperatures from -16384°C to + 16383°C
    • SBS/01TE -> Temperatures from -16384°C to + 16383°C
    • F1713 -> Temperatures from -16384°C to + 16383°C
    • SBS/01T -> Temperatures from -100°C to + 32667°C
    • SBS-01RM -> Rotations per Minute (RPM) from 0 to 393210
    • SBS/01RB -> Rotations per Minute (RPM) from 0 to 393210
    • SBS-01RO -> Rotations per Minute (RPM) from 0 to 393210
    • Curr-1678 -> Voltage (0V to 655.3V) / Current (0A to 163.0A) / Capacity Sensor (0mAh to 32767 mAh)
    • F1678 -> Voltage (0V to 655.3V) / Current (0A to 163.0A) / Capacity Sensor (0mAh to 32767 mAh)
    • GPS-1675 -> Speed (0km/h to 999km/h) / Altitude (-16384m to +16383m) / Vario (-3276.8m/s to +3276.7m/s) / LON&LAT (Degree Minutes)
    • F1675 -> Speed (0km/h to 999km/h) / Altitude (-16384m to +16383m) / Vario (-3276.8m/s to +3276.7m/s) / LON&LAT (Degree Minutes)
    • SBS-01G -> Speed (0km/h to 511km/h) / Altitude (-820m to +4830m) / Vario (-150m/s to +260m/s) / LON&LAT (Degree Minutes)
    • SBS/01V -> Voltage (0V to 819.1V) / Voltage (0V to 819.1V)
    • F1672 -> Vario (-327.6m/s to +327.6m/s) / Altitude (-16384m to +16383m)
    • F1712 -> Vario (-3276.8m/s to +3276.7m/s) / Altitude (-16384m to +16383m)

    Unsupported Sensors: Issues

    • SBS/01S
    • SBS-01TAS
    • F1677
    • SBS/01C
    • P-SBS/01T
    • SBS/02A
    • SBS/01A

    Supported Radio Systems

    • T14SG
    • T18MZ
    • T10J

    Supported Receivers

    • R7003SB
    • R7008SB
    • R3006SB

    Supported MCU

    • Arduino Pro Mini 8MHz (with external Inverters)
    • Arduino Pro Mini 16MHz (with external Inverters)
    • ESP32

    Inverter Schematic for Atmega328P (Arduino Pro mini)

    correct inverter

    Guide for Library Development

    The Futaba Telemetry Protocol has very strict timing requirements (a conceptual timing skeleton is sketched after this list):

    • Slot 0 must be sent 2ms after the last Byte of the SBUS(2) Frame
    • The same applies to Slot 8, Slot 16 and Slot 24
    • After every Slot there must be a 325µs pause before the next Slot
    • So you have to receive the SBUS Frame in a UART Interrupt
    • And with the last Byte of the Frame you have to start a 2ms Timer
    • With every Timer Interrupt you have to set the next Timer Interrupt to 660µs
    • If the Timer doesn’t work under these conditions, there is no point in starting to develop the Slot data transmission
    • You need a Logic Analyser or Oscilloscope
    • The best way is to toggle some Pins to check the correct timing
    • SBUS(2) Level Voltage is 3.3V -> Do not use 5V Level Signals!
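
    To make the sequence concrete, here is a conceptual skeleton built only from the numbers above; onUartLastByteReceived(), startTimerMicroseconds() and transmitSlot() are placeholders for your MCU’s UART/timer facilities and for the actual Slot transmission, not functions of this Library:

    volatile uint8_t nextSlot = 0;

    void onUartLastByteReceived() {     // call this from the UART RX interrupt
      nextSlot = 0;                     // or 8 / 16 / 24, depending on the End Byte
      startTimerMicroseconds(2000);     // Slot 0 goes out 2ms after the last Byte
    }

    void onTimerInterrupt() {
      transmitSlot(nextSlot++);         // send one Telemetry Slot (placeholder)
      startTimerMicroseconds(660);      // the next Slot follows 660µs later
    }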

    Additional Information about Futaba’s SBUS and SBUS2

    • SBUS(2) is 100000 Baud with 8 data bits, even parity and 2 stop bits
    • SBUS(2) Signal is inverted UART
    • SBUS is a Frame with 25 Bytes
      • Byte [0] is 0x0F
      • Byte [1-22] is Servo Channel Data
      • Byte [23] is DigiChannel 17&18 + Status Bits
      • Byte [24] is 0x00
    • SBUS2 is an SBUS Frame but with a different End Byte (Byte 24)
      • Byte [24] is 0x04, 0x14, 0x24 or 0x34
      • Byte 24 selects the Telemetry Slot group (see the helper sketch below)
        • 0x04 -> Slot 0 to Slot 7
        • 0x14 -> Slot 8 to Slot 15
        • 0x24 -> Slot 16 to Slot 23
        • 0x34 -> Slot 24 to Slot 31
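
    A small helper that encodes this mapping (a sketch only; it is not part of the Library’s API):

    #include <stdint.h>

    // Maps the SBUS2 End Byte (Byte [24]) to the first Telemetry Slot of its group.
    // Returns -1 for a plain SBUS Frame (0x00) or any unknown value.
    int8_t slotGroupStart(uint8_t endByte) {
      switch (endByte) {
        case 0x04: return 0;    // Slot 0  to Slot 7
        case 0x14: return 8;    // Slot 8  to Slot 15
        case 0x24: return 16;   // Slot 16 to Slot 23
        case 0x34: return 24;   // Slot 24 to Slot 31
        default:   return -1;   // SBUS Frame without Telemetry Slots
      }
    }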

    Version

    0.1 created

    0.2 Inverter instead of 3-State Buffer

    0.3 16MHz support, new API

    1.0 Pre-Release

    1.0 Release

    1.1 ESP32 Support

    1.2 Available in Arduino Library Manager

    Credits

    Bart Keser with his Castle Creation Telemetry to Futaba Telemetry Converter

    Alex K. and his Development

    Visit original content creator repository https://github.com/BrushlessPower/SBUS2-Telemetry
  • StringsPatcher

    [DEPRECATED]

    ⚠️ This repository is no longer actively maintained ⚠️

    Strings Patcher

    It’s very common (at least in our company) that just after a new release, someone discovers a translation that is wrong, or even missing in a specific language (we support over 10 different languages).

    When this happens we have 2 options: create a patch release with the missing translation and release again, with all the work and overhead that implies, or wait until next week for a release with the translation fixed.

    Of course, most of these problems can be avoided by being very careful when adding a new translation, but we are humans and we make mistakes. This library was made to make those mistakes a little bit less painful.

    Features

    • Update (Patch) String values on the fly.
    • Serverless implementation, relying on Google Spreadsheets for data storage

    Setup

    Adding the library to your project

    Add to the top-level build.gradle file

    allprojects {
        repositories {
            maven { url "https://jitpack.io" }
        }
    }
    

    Add to the app module’s build.gradle file

    dependencies {
        compile 'com.github.cookpad:StringsPatcher:0.1.0'
    }
    

    Setting up the spreadsheet

    We recommend that, if you can, you use the first setup (A). The drawback is that your string-patches spreadsheet has to be public. If you want your spreadsheets to be private, use setup (B).

    A – (Easy) Public spreadsheet
    1 Getting the spreadsheet key
    • Go to Google Drive and create a new spreadsheet
    • Add these 3 words into the first 3 columns, on the first row: lang, key & value
    • Click on File > Publish to the web > Publish
    • Copy the Spreadsheet key, which is a long string of numbers and letters that you can get from the displayed url `https://docs.google.com/a/cookpad.jp/spreadsheets/d/[spreadsheet-id]/pubhtml`, and don’t lose it; you’ll need it when setting up the library.
    2 Other spreadsheet settings
    • Name your worksheet (bottom tab) with the same name as your Android build number (this is optional, but you have to at least name the worksheet something you’ll remember)
    • [optional] You can create multiple worksheets named after each Android build number; that way you can maintain different string patches for different versions of your app
    B – (Hard) Private spreadsheet
    1 Getting the spreadsheet key
    • Go to Google Drive and create a new spreadsheet
    • Add these 3 words into the first 3 columns, on the first row: lang, key & value
    • Copy the Spreadsheet key, which is a long string of numbers and letters that you can get from the url https://docs.google.com/spreadsheets/d/[spreadsheet-id]/edit#gid=0, and don’t lose it; you’ll need it when setting up the library.
    2 Other spreadsheet settings
    • Name your worksheet (bottom tab) with the same name as your Android build number (this is optional, but you have to at least name the worksheet something you’ll remember)
    • [optional] You can create multiple worksheets named after each Android build number; that way you can maintain different string patches for different versions of your app
    3 Getting Google App credentials
    • Go to Google Dev Console https://console.developers.google.com and create a new App
    • In the Dashboard, enable the Google Sheets API in order to access the spreadsheets
    • Now go to Credentials > Create Credential > OAuth client ID > Web application
    • Add http://localhost to Authorized JavaScript origins
    • Also add http://localhost and https://developers.google.com/oauthplayground to Authorized redirect URIs
    • Copy somewhere your client ID and Secret
    4 Getting a Refresh Token
    • Go to the Google OAuth Playground (https://developers.google.com/oauthplayground), open its settings, select “Use your own OAuth credentials” and enter your client ID and Secret
    • Now, in the input box on the left that says “Input your own scopes”, type https://www.googleapis.com/auth/spreadsheets.readonly and press Authorize APIs
    • You will see a Google login form. Log in with the same user that you used to create the spreadsheet. Press Allow (this only allows the app to read your spreadsheets, not modify anything)

    • Wait for the playground to load and then press Exchange authorization code for tokens

    • Copy your Refresh token somewhere; you’ll need it too. (You might need to press Exchange authorization code for tokens in Step 2 again to see the tokens)

    • You should now have the client ID, client Secret and Refresh Token. You’re ready!

    Usage

    Call syncStringPatcher() in your Android Application class, supplying the context and your spreadSheetKey as the only mandatory parameters.

    class StringsPatcherApp : Application() {
        override fun onCreate() {
            super.onCreate()
    
            val spreadSheetKey = "YOUR_SPREAD_SHEET_KEY"
            syncStringPatcher(this, spreadSheetKey)
        }
    } 

    Programmatically

    syncStringPatcher exposes the following optional params (a hedged example follows the list):

    • worksheetName: the worksheet name (the spreadsheet may be composed of several worksheets). This param defaults to the versionCode of the application; that way, your spreadsheet should have as many worksheets as release versions (1,2,3,4,…).
    • locale: the locale used to filter strings. It defaults to the system locale.
    • logger: callback function for listening to emitted errors. By default, a dummy implementation does nothing.
    • resourcesClass: supply your app’s auto-generated R.string::class only if you need to patch strings set from XML layouts.
    • googleCredentials: only supply these credentials if the spreadsheet has private access.

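    A sketch of how those parameters might be passed; the parameter names come from the list above, but the exact types and signatures are assumptions, so check the library source for the real ones:

    import android.app.Application
    import android.util.Log
    import java.util.Locale

    class StringsPatcherApp : Application() {
        override fun onCreate() {
            super.onCreate()

            syncStringPatcher(
                this,
                "YOUR_SPREAD_SHEET_KEY",
                worksheetName = BuildConfig.VERSION_CODE.toString(), // defaults to the versionCode
                locale = Locale.getDefault(),                        // defaults to the system locale
                logger = { error -> Log.e("StringsPatcher", error.toString()) }, // assumed callback shape
                resourcesClass = R.string::class                     // only needed for XML patching
            )
        }
    }
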
    Once StringsPatcher has been initialized, in order to access the patches you must retrieve string resources at runtime by calling either Context::getSmartString or Resources::getSmartString. For formatting strings, Context::getSmartString(formatArgs) and Resources::getSmartString(formatArgs) are exposed.

    If there is no patch for a given key, the library falls back to the system resources.
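
    For instance, retrieval might look like the sketch below; R.string.greeting and R.string.welcome_user are illustrative keys, not part of the library:

    import android.content.Context

    fun bindTexts(context: Context, userName: String) {
        // Returns the patched value if one exists for the key and current locale,
        // otherwise the regular resource value.
        val title = context.getSmartString(R.string.greeting)
        val welcome = context.resources.getSmartString(R.string.welcome_user, userName)
    }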

    XML layout

    It’s possible to patch strings that have been set from XML layouts (currently the lib only supports the text and hint properties of any View extending TextView). Call bindStringsPatchers, supplying the View root from which you want to start replacing strings recursively (be aware that this may be a performance penalty for very nested views). A standard way of using it is to call it on the Activity’s onStart event, supplying its view root; a more holistic approach is to use ActivityLifecycleCallbacks to configure that behaviour for every single Activity of your application (see the sketch below). For a working example, take a look at the app_test module.
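
    A sketch of the onStart wiring described above; the exact parameter list of bindStringsPatchers may differ, and android.R.id.content is used here only to grab the Activity’s root view:

    import android.view.View
    import androidx.appcompat.app.AppCompatActivity

    class PatchedActivity : AppCompatActivity() {
        override fun onStart() {
            super.onStart()
            // Signature assumed from the description above: supply the view root
            // from which strings should be replaced recursively.
            val root = findViewById<View>(android.R.id.content)
            bindStringsPatchers(root)
        }
    }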

    Debug

    By setting stringPatcherDebugEnabled to true, the lib will append to each value its associated key, as well as a pencil emoji. This is intended to make clear which values are handled by the lib and are therefore eligible to be updated by changing their values in the spreadsheet.

    Visit original content creator repository https://github.com/cookpad/StringsPatcher