Distributed learning approaches have recently been studied as a way to use data from multiple sources without sharing them, but they are usually unsuitable for applications in which each client carries out a different task. Meanwhile, the Transformer has been widely explored in computer vision owing to its ability to learn a common representation through global attention. Leveraging these advantages, we present a new distributed learning framework for multiple image processing tasks that allows clients to learn distinct tasks with their local data. The framework arises from a disentangled representation of local and non-local features, using a task-specific head and tail and a task-agnostic Vision Transformer. Each client learns a translation from its own task to a common representation using the task-specific networks, while the Transformer body on the server learns global attention between the features embedded in that representation. To enable decomposition between the task-specific and common representations, we propose a training strategy that alternates between the clients and the server. Experimental results on distributed learning for various tasks show that our method synergistically improves the performance of each client with its own data.
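The alternating strategy can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: scalars stand in for each client's head and tail and for the shared Transformer body, each "task" is just a different target scale, and the names `loss_and_grad` and `train`, the learning rate, and the averaging of body gradients across clients are all assumptions made for illustration. The sketch shows only the schedule: clients update their task-specific parameters while the body is frozen, then the server updates the body while heads and tails are frozen.

```python
# Toy sketch (assumption): scalars replace the head, tail, and Transformer body.
XS = [0.5, 1.0, 1.5, 2.0]  # hypothetical local inputs, shared here for brevity


def loss_and_grad(head, body, tail, scale):
    """Return the MSE of tail*body*head*x against scale*x, and dL/d(product)."""
    p = tail * body * head
    errs = [p * x - scale * x for x in XS]
    loss = sum(e * e for e in errs) / len(XS)
    grad_p = sum(2 * e * x for e, x in zip(errs, XS)) / len(XS)
    return loss, grad_p


def train(task_scales, rounds=200, lr=0.02):
    """Alternate client-side and server-side gradient steps."""
    clients = [{"head": 1.0, "tail": 1.0, "scale": s} for s in task_scales]
    body = 1.0  # task-agnostic parameter held by the server
    for _ in range(rounds):
        # Client phase: body is frozen, each client updates its own head/tail.
        for c in clients:
            _, g = loss_and_grad(c["head"], body, c["tail"], c["scale"])
            g_head = g * c["tail"] * body
            g_tail = g * c["head"] * body
            c["head"] -= lr * g_head
            c["tail"] -= lr * g_tail
        # Server phase: heads/tails are frozen, the body gets the averaged gradient.
        g_body = 0.0
        for c in clients:
            _, g = loss_and_grad(c["head"], body, c["tail"], c["scale"])
            g_body += g * c["head"] * c["tail"]
        body -= lr * g_body / len(clients)
    losses = [
        loss_and_grad(c["head"], body, c["tail"], c["scale"])[0] for c in clients
    ]
    return losses, body


if __name__ == "__main__":
    final_losses, shared_body = train((2.0, 0.5))
    print(final_losses, shared_body)
```

In the framework described above, the head and tail would be small task-specific networks kept on each client and the body would be a Vision Transformer on the server. The sketch keeps only the separation of which parameters move in which phase.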
DOI: http://dx.doi.org/10.1109/TIP.2022.3226892